AMD INVLPGB Linux Patch Performance
Benchmarks for a future article looking at the Linux kernel patches that add AMD broadcast TLB invalidation using the INVLPGB instruction, available on AMD Zen 3 and newer processors.
HTML result view exported from: https://openbenchmarking.org/result/2412250-NE-AMDINVLPG56&grs
System configuration (both runs used the same system; only the kernel differs)
  Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
  Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
  Chipset: AMD 1Ah
  Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
  Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive
  Graphics: ASPEED
  Network: 2 x Broadcom NetXtreme BCM5720 PCIe
  OS: Ubuntu 24.10
  Kernel (Linux 6.13 Git): 6.13.0-rc4-phx-stock (x86_64)
  Kernel (INVLPGB Patched): 6.13.0-rc4-phx-broadcast-tlb (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 1024x768
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details: NONE / relatime,rw,stripe=64 / Block Size: 4096
Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled); CPU Microcode: 0xb002116
Java Details: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
Python Details: Python 3.12.7
Security Details:
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  reg_file_data_sampling: Not affected
  retbleed: Not affected
  spec_rstack_overflow: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
Results overview
Test | Unit | Better | Linux 6.13 Git | INVLPGB Patched
dacapobench: Zxing 1D/2D Barcode Image Processing | msec | Fewer | 648 | 619
dacapobench: Jython | msec | Fewer | 4320 | 4158
mariadb: oltp_update_index - 256 | Queries Per Second | More | 113658 | 117679
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | Seconds | Fewer | 109.86392 | 106.20884
cassandra: Writes | Op/s | More | 438419 | 452025
apache-iotdb: 800 - 100 - 800 - 400 | Average Latency | Fewer | 212.98 | 206.65
nginx: 500 | Requests Per Second | More | 503771.76 | 517615.54
quicksilver: CORAL2 P1 | Figure Of Merit | More | 34086667 | 34996667
whisper-cpp: ggml-small.en - 2016 State of the Union | Seconds | Fewer | 225.41952 | 219.87375
npb: EP.C | Total Mop/s | More | 11180.11 | 11450.17
apache-iotdb: 800 - 100 - 800 - 400 | point/sec | More | 141171394 | 144092202
clickhouse: 100M Rows Hits Dataset, Second Run | Queries Per Minute, Geo Mean | More | 764.71 | 749.62
apache-iotdb: 500 - 100 - 500 - 100 | Average Latency | Fewer | 47.85 | 47.03
dacapobench: GraphChi | msec | Fewer | 2363 | 2323
speedb: Read Rand Write Rand | Op/s | More | 3054826 | 3106319
whisper-cpp: ggml-medium.en - 2016 State of the Union | Seconds | Fewer | 460.62129 | 453.06996
renaissance: Savina Reactors.IO | ms | Fewer | 5722.9 | 5651.4
apache-iotdb: 500 - 100 - 800 - 400 | Average Latency | Fewer | 211.81 | 214.48
rodinia: OpenMP Leukocyte | Seconds | Fewer | 31.674 | 31.298
dacapobench: Avrora AVR Simulation Framework | msec | Fewer | 2585 | 2556
clickhouse: 100M Rows Hits Dataset, Third Run | Queries Per Minute, Geo Mean | More | 751.84 | 759.44
apache-iotdb: 500 - 100 - 500 - 100 | point/sec | More | 96806809 | 97751377
renaissance: Apache Spark PageRank | ms | Fewer | 2261.3 | 2283.3
apache-iotdb: 500 - 100 - 800 - 100 | Average Latency | Fewer | 61.49 | 60.93
apache-iotdb: 800 - 100 - 500 - 100 | Average Latency | Fewer | 40.40 | 40.04
renaissance: Apache Spark Bayes | ms | Fewer | 180.1 | 181.7
npb: BT.C | Total Mop/s | More | 353192.16 | 356244.91
apache-iotdb: 800 - 100 - 500 - 100 | point/sec | More | 117887210 | 118890170
renaissance: ALS Movie Lens | ms | Fewer | 18214.8 | 18368.8
speedb: Rand Read | Op/s | More | 622084083 | 627286939
apache-iotdb: 500 - 100 - 800 - 100 | point/sec | More | 120738449 | 121681953
apache-iotdb: 500 - 100 - 500 - 400 | point/sec | More | 99520170 | 100250585
npb: SP.C | Total Mop/s | More | 154232.79 | 155286.47
mariadb: oltp_read_write - 256 | Queries Per Second | More | 172615 | 173765
rocksdb: Read While Writing | Op/s | More | 14641889 | 14737550
build-python: Default | Seconds | Fewer | 13.067 | 12.984
apache-iotdb: 500 - 100 - 800 - 400 | point/sec | More | 125423055 | 124631354
dacapobench: BioJava Biological Data Framework | msec | Fewer | 4519 | 4547
build-python: Released Build, PGO + LTO Optimized | Seconds | Fewer | 190.912 | 191.972
clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache | Queries Per Minute, Geo Mean | More | 738.99 | 735.32
dacapobench: Apache Xalan XSLT | msec | Fewer | 816 | 812
mariadb: oltp_read_only - 256 | Queries Per Second | More | 37688 | 37868
speedb: Update Rand | Op/s | More | 519867 | 522238
apache-iotdb: 800 - 100 - 500 - 400 | point/sec | More | 121237884 | 120726010
build-llvm: Ninja | Seconds | Fewer | 90.415 | 90.775
memcached: 1:5 | Ops/sec | More | 3569817.66 | 3583145.46
blender: Barbershop - CPU-Only | Seconds | Fewer | 124.03 | 124.48
npb: LU.C | Total Mop/s | More | 312554.83 | 311722.61
build-llvm: Unix Makefiles | Seconds | Fewer | 160.666 | 161.060
apache-iotdb: 500 - 100 - 500 - 400 | Average Latency | Fewer | 168.36 | 167.95
npb: IS.D | Total Mop/s | More | 7001.42 | 6984.65
build-godot: Time To Compile | Seconds | Fewer | 80.687 | 80.831
namd: STMV with 1,066,628 Atoms | ns/day | More | 4.21510 | 4.20973
dacapobench: Apache Kafka | msec | Fewer | 5046 | 5050
dacapobench: jMonkeyEngine | msec | Fewer | 6808 | 6806
renaissance: In-Memory Database Shootout | ms | Fewer | 4511.6 | 4512.9
namd: ATPase with 327,506 Atoms | ns/day | More | 12.22443 | 12.22125
quicksilver: CORAL2 P2 | Figure Of Merit | More | 22056667 | 22056667
xmrig: GhostRider - 1M | H/s | More | 13706.9 | 16061.4
relion: Basic - CPU | Seconds | Fewer | 214.442 | 211.380
npb: MG.C | Total Mop/s | More | 145177.79 | 147724.66
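Because the overview mixes "More Is Better" and "Fewer Is Better" metrics, raw differences are not directly comparable across tests. The following is a minimal sketch, not part of the Phoronix Test Suite, of one way to normalize them into a signed percentage where positive always means the patched kernel did better. The function name `percent_gain` and the choice of three sample rows are illustrative assumptions; the figures are copied from the overview.

```python
def percent_gain(stock: float, patched: float, higher_is_better: bool) -> float:
    """Return the patched kernel's improvement over stock, in percent.

    Positive means the patched kernel did better, whatever the metric direction.
    (Hypothetical helper for illustration; not a Phoronix Test Suite API.)
    """
    if higher_is_better:
        return (patched / stock - 1.0) * 100.0
    # For time and latency metrics the smaller value is the better one.
    return (stock / patched - 1.0) * 100.0


# (test, stock, patched, higher_is_better) copied from the overview figures.
samples = [
    ("xmrig: GhostRider - 1M", 13706.9, 16061.4, True),
    ("dacapobench: Zxing 1D/2D Barcode Image Processing", 648.0, 619.0, False),
    ("clickhouse: 100M Rows Hits Dataset, Second Run", 764.71, 749.62, True),
]

for name, stock, patched, higher_is_better in samples:
    gain = percent_gain(stock, patched, higher_is_better)
    print(f"{name}: {gain:+.2f}%")
```

Dividing stock by patched for the "Fewer Is Better" case keeps both directions on the same speedup scale, so the per-test percentages could later be combined, for example with a geometric mean.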
DaCapo Benchmark 23.11 - Java Test: Zxing 1D/2D Barcode Image Processing (msec, Fewer Is Better)
  Linux 6.13 Git: 648 (SE +/- 4.04, N = 3)
  INVLPGB Patched: 619 (SE +/- 6.43, N = 3)
DaCapo Benchmark 23.11 - Java Test: Jython (msec, Fewer Is Better)
  Linux 6.13 Git: 4320 (SE +/- 37.01, N = 15)
  INVLPGB Patched: 4158 (SE +/- 10.53, N = 3)
MariaDB 11.5 - Test: oltp_update_index - Threads: 256 (Queries Per Second, More Is Better)
  Linux 6.13 Git: 113658 (SE +/- 190.75, N = 3)
  INVLPGB Patched: 117679 (SE +/- 200.13, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  Linux 6.13 Git: 109.86
  INVLPGB Patched: 106.21
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
Apache Cassandra 5.0 - Test: Writes (Op/s, More Is Better)
  Linux 6.13 Git: 438419 (SE +/- 5248.18, N = 3)
  INVLPGB Patched: 452025 (SE +/- 2371.91, N = 3)
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 212.98 (SE +/- 2.18, N = 3; MAX: 26561.37)
  INVLPGB Patched: 206.65 (SE +/- 3.85, N = 3; MAX: 26883.6)
nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better)
  Linux 6.13 Git: 503771.76 (SE +/- 1734.77, N = 3)
  INVLPGB Patched: 517615.54 (SE +/- 1146.82, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
Quicksilver 20230818 - Input: CORAL2 P1 (Figure Of Merit, More Is Better)
  Linux 6.13 Git: 34086667 (SE +/- 46666.67, N = 3)
  INVLPGB Patched: 34996667 (SE +/- 98713.95, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -march=native
Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  Linux 6.13 Git: 225.42 (SE +/- 0.39, N = 3)
  INVLPGB Patched: 219.87 (SE +/- 1.60, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 11180.11 (SE +/- 27.23, N = 3)
  INVLPGB Patched: 11450.17 (SE +/- 98.82, N = 15)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 141171394 (SE +/- 870384.94, N = 3)
  INVLPGB Patched: 144092202 (SE +/- 1488492.23, N = 3)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better)
  Linux 6.13 Git: 764.71 (SE +/- 4.01, N = 3; MIN: 69.85 / MAX: 8571.43)
  INVLPGB Patched: 749.62 (SE +/- 1.96, N = 3; MIN: 69.28 / MAX: 6666.67)
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 47.85 (SE +/- 0.55, N = 3; MAX: 12576.32)
  INVLPGB Patched: 47.03 (SE +/- 0.36, N = 3; MAX: 12578.34)
DaCapo Benchmark 23.11 - Java Test: GraphChi (msec, Fewer Is Better)
  Linux 6.13 Git: 2363 (SE +/- 8.96, N = 3)
  INVLPGB Patched: 2323 (SE +/- 15.84, N = 3)
Speedb 2.7 - Test: Read Random Write Random (Op/s, More Is Better)
  Linux 6.13 Git: 3054826 (SE +/- 5973.22, N = 3)
  INVLPGB Patched: 3106319 (SE +/- 22974.88, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  Linux 6.13 Git: 460.62 (SE +/- 3.15, N = 3)
  INVLPGB Patched: 453.07 (SE +/- 0.73, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Renaissance 0.16 - Test: Savina Reactors.IO (ms, Fewer Is Better)
  Linux 6.13 Git: 5722.9 (SE +/- 75.13, N = 3; MIN: 5596.11 / MAX: 9196.98)
  INVLPGB Patched: 5651.4 (SE +/- 52.24, N = 7; MIN: 5405.73 / MAX: 9193.8)
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 211.81 (SE +/- 2.05, N = 3; MAX: 26536.58)
  INVLPGB Patched: 214.48 (SE +/- 0.65, N = 3; MAX: 26489.9)
Rodinia 3.1 - Test: OpenMP Leukocyte (Seconds, Fewer Is Better)
  Linux 6.13 Git: 31.67 (SE +/- 0.33, N = 3)
  INVLPGB Patched: 31.30 (SE +/- 0.16, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
DaCapo Benchmark 23.11 - Java Test: Avrora AVR Simulation Framework (msec, Fewer Is Better)
  Linux 6.13 Git: 2585 (SE +/- 8.50, N = 3)
  INVLPGB Patched: 2556 (SE +/- 19.37, N = 3)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better)
  Linux 6.13 Git: 751.84 (SE +/- 0.95, N = 3; MIN: 70.34 / MAX: 6666.67)
  INVLPGB Patched: 759.44 (SE +/- 3.07, N = 3; MIN: 69.04 / MAX: 8571.43)
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (point/sec, More Is Better)
  Linux 6.13 Git: 96806809 (SE +/- 1081476.60, N = 3)
  INVLPGB Patched: 97751377 (SE +/- 291260.63, N = 3)
Renaissance 0.16 - Test: Apache Spark PageRank (ms, Fewer Is Better)
  Linux 6.13 Git: 2261.3 (SE +/- 9.21, N = 3; MIN: 1572.27 / MAX: 2279.05)
  INVLPGB Patched: 2283.3 (SE +/- 22.38, N = 3; MIN: 1561.41 / MAX: 2328.07)
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 61.49 (SE +/- 0.36, N = 3; MAX: 11313.55)
  INVLPGB Patched: 60.93 (SE +/- 0.25, N = 3; MAX: 11306.84)
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 40.40 (SE +/- 0.37, N = 3; MAX: 23837.32)
  INVLPGB Patched: 40.04 (SE +/- 0.04, N = 3; MAX: 23821.12)
Renaissance 0.16 - Test: Apache Spark Bayes (ms, Fewer Is Better)
  Linux 6.13 Git: 180.1 (SE +/- 0.85, N = 3; MIN: 160.23 / MAX: 297.68)
  INVLPGB Patched: 181.7 (SE +/- 1.59, N = 3; MIN: 164.62 / MAX: 284.35)
NAS Parallel Benchmarks 3.4 - Test / Class: BT.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 353192.16 (SE +/- 3855.67, N = 3)
  INVLPGB Patched: 356244.91 (SE +/- 3962.64, N = 5)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (point/sec, More Is Better)
  Linux 6.13 Git: 117887210 (SE +/- 1087988.19, N = 3)
  INVLPGB Patched: 118890170 (SE +/- 226557.07, N = 3)
Renaissance 0.16 - Test: ALS Movie Lens (ms, Fewer Is Better)
  Linux 6.13 Git: 18214.8 (SE +/- 45.86, N = 3; MIN: 17691.46 / MAX: 18303.77)
  INVLPGB Patched: 18368.8 (SE +/- 114.89, N = 3; MIN: 17674.24 / MAX: 18580.57)
Speedb 2.7 - Test: Random Read (Op/s, More Is Better)
  Linux 6.13 Git: 622084083 (SE +/- 7230810.31, N = 3)
  INVLPGB Patched: 627286939 (SE +/- 1540848.02, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (point/sec, More Is Better)
  Linux 6.13 Git: 120738449 (SE +/- 58310.64, N = 3)
  INVLPGB Patched: 121681953 (SE +/- 67599.70, N = 3)
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 99520170 (SE +/- 603733.82, N = 3)
  INVLPGB Patched: 100250585 (SE +/- 97805.09, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 154232.79 (SE +/- 1469.84, N = 3)
  INVLPGB Patched: 155286.47 (SE +/- 279.05, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
MariaDB 11.5 - Test: oltp_read_write - Threads: 256 (Queries Per Second, More Is Better)
  Linux 6.13 Git: 172615 (SE +/- 73.62, N = 3)
  INVLPGB Patched: 173765 (SE +/- 93.11, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better)
  Linux 6.13 Git: 14641889 (SE +/- 157207.18, N = 3)
  INVLPGB Patched: 14737550 (SE +/- 155717.24, N = 15)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Timed CPython Compilation 3.10.6 - Build Configuration: Default (Seconds, Fewer Is Better)
  Linux 6.13 Git: 13.07
  INVLPGB Patched: 12.98
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 125423055 (SE +/- 352556.17, N = 3)
  INVLPGB Patched: 124631354 (SE +/- 332877.17, N = 3)
DaCapo Benchmark 23.11 - Java Test: BioJava Biological Data Framework (msec, Fewer Is Better)
  Linux 6.13 Git: 4519 (SE +/- 20.17, N = 3)
  INVLPGB Patched: 4547 (SE +/- 17.52, N = 3)
Timed CPython Compilation 3.10.6 - Build Configuration: Released Build, PGO + LTO Optimized (Seconds, Fewer Is Better)
  Linux 6.13 Git: 190.91
  INVLPGB Patched: 191.97
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better)
  Linux 6.13 Git: 738.99 (SE +/- 6.88, N = 3; MIN: 69.85 / MAX: 7500)
  INVLPGB Patched: 735.32 (SE +/- 4.99, N = 3; MIN: 68.03 / MAX: 7500)
DaCapo Benchmark 23.11 - Java Test: Apache Xalan XSLT (msec, Fewer Is Better)
  Linux 6.13 Git: 816 (SE +/- 7.60, N = 6)
  INVLPGB Patched: 812 (SE +/- 2.00, N = 3)
MariaDB 11.5 - Test: oltp_read_only - Threads: 256 (Queries Per Second, More Is Better)
  Linux 6.13 Git: 37688 (SE +/- 17.06, N = 3)
  INVLPGB Patched: 37868 (SE +/- 112.63, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
Speedb 2.7 - Test: Update Random (Op/s, More Is Better)
  Linux 6.13 Git: 519867 (SE +/- 210.08, N = 3)
  INVLPGB Patched: 522238 (SE +/- 726.96, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 121237884 (SE +/- 875042.28, N = 3)
  INVLPGB Patched: 120726010 (SE +/- 1093112.93, N = 3)
Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, Fewer Is Better)
  Linux 6.13 Git: 90.42 (SE +/- 0.24, N = 3)
  INVLPGB Patched: 90.78 (SE +/- 0.11, N = 3)
Memcached 1.6.19 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better)
  Linux 6.13 Git: 3569817.66 (SE +/- 2641.20, N = 3)
  INVLPGB Patched: 3583145.46 (SE +/- 7706.14, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Blender 4.3 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  Linux 6.13 Git: 124.03 (SE +/- 0.13, N = 3)
  INVLPGB Patched: 124.48 (SE +/- 0.12, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 312554.83 (SE +/- 2270.42, N = 15)
  INVLPGB Patched: 311722.61 (SE +/- 2363.19, N = 10)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better)
  Linux 6.13 Git: 160.67 (SE +/- 0.32, N = 3)
  INVLPGB Patched: 161.06 (SE +/- 0.59, N = 3)
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 168.36 (SE +/- 0.88, N = 3; MAX: 26450.24)
  INVLPGB Patched: 167.95 (SE +/- 0.71, N = 3; MAX: 26468.05)
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better)
  Linux 6.13 Git: 7001.42 (SE +/- 60.97, N = 3)
  INVLPGB Patched: 6984.65 (SE +/- 68.35, N = 5)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Timed Godot Game Engine Compilation 4.0 - Time To Compile (Seconds, Fewer Is Better)
  Linux 6.13 Git: 80.69 (SE +/- 0.05, N = 3)
  INVLPGB Patched: 80.83 (SE +/- 0.16, N = 3)
NAMD 3.0 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better)
  Linux 6.13 Git: 4.21510 (SE +/- 0.00531, N = 3)
  INVLPGB Patched: 4.20973 (SE +/- 0.00334, N = 3)
DaCapo Benchmark 23.11 - Java Test: Apache Kafka (msec, Fewer Is Better)
  Linux 6.13 Git: 5046 (SE +/- 4.26, N = 3)
  INVLPGB Patched: 5050 (SE +/- 6.08, N = 3)
DaCapo Benchmark 23.11 - Java Test: jMonkeyEngine (msec, Fewer Is Better)
  Linux 6.13 Git: 6808 (SE +/- 1.67, N = 3)
  INVLPGB Patched: 6806 (SE +/- 1.67, N = 3)
Renaissance 0.16 - Test: In-Memory Database Shootout (ms, Fewer Is Better)
  Linux 6.13 Git: 4511.6 (SE +/- 55.34, N = 4; MIN: 4295.26 / MAX: 5024.51)
  INVLPGB Patched: 4512.9 (SE +/- 34.36, N = 3; MIN: 4347.69 / MAX: 5072.64)
NAMD 3.0 - Input: ATPase with 327,506 Atoms (ns/day, More Is Better)
  Linux 6.13 Git: 12.22 (SE +/- 0.03, N = 3)
  INVLPGB Patched: 12.22 (SE +/- 0.03, N = 3)
Quicksilver 20230818 - Input: CORAL2 P2 (Figure Of Merit, More Is Better)
  Linux 6.13 Git: 22056667 (SE +/- 12018.50, N = 3)
  INVLPGB Patched: 22056667 (SE +/- 18559.21, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -march=native
Xmrig 6.21 - Variant: GhostRider - Hash Count: 1M (H/s, More Is Better)
  Linux 6.13 Git: 13706.9 (SE +/- 668.22, N = 15)
  INVLPGB Patched: 16061.4 (SE +/- 32.37, N = 3)
  1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
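The Xmrig result pairs the largest gap in this comparison with the largest run-to-run spread on the stock kernel (SE +/- 668.22 over 15 runs). As a rough sketch, and not something the Phoronix Test Suite itself reports, the difference between two means can be expressed in units of their combined standard error to judge whether it stands clear of the noise. The helper name `gap_in_standard_errors` is an illustrative assumption, the runs are assumed independent, and with N as small as 3 this is a heuristic rather than a formal significance test.

```python
import math


def gap_in_standard_errors(mean_a: float, se_a: float,
                           mean_b: float, se_b: float) -> float:
    """Express |mean_b - mean_a| in units of the combined standard error.

    Hypothetical helper for illustration; assumes the two runs are independent.
    """
    # For independent measurements the standard errors add in quadrature.
    combined_se = math.hypot(se_a, se_b)
    return abs(mean_b - mean_a) / combined_se


# Figures copied from the Xmrig GhostRider result (H/s).
xmrig = gap_in_standard_errors(13706.9, 668.22, 16061.4, 32.37)
# Figures copied from the ClickHouse First Run / Cold Cache result.
clickhouse = gap_in_standard_errors(738.99, 6.88, 735.32, 4.99)

print(f"xmrig gap: {xmrig:.2f} combined standard errors")
print(f"clickhouse gap: {clickhouse:.2f} combined standard errors")
```

A gap of several combined standard errors suggests the difference is larger than the measured spread, while a gap below one is hard to distinguish from noise.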
RELION 5.0 - Test: Basic - Device: CPU (Seconds, Fewer Is Better)
  Linux 6.13 Git: 214.44 (SE +/- 3.72, N = 12)
  INVLPGB Patched: 211.38 (SE +/- 4.55, N = 12)
  1. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 145177.79 (SE +/- 2956.05, N = 15)
  INVLPGB Patched: 147724.66 (SE +/- 1764.21, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Phoronix Test Suite v10.8.5