AMD INVLPGB Linux Patch Performance
Benchmarks for a future article looking at the AMD broadcast TLB invalidation Linux kernel patches, which use the INVLPGB instruction available on AMD Zen 3 and newer processors.
HTML result view exported from: https://openbenchmarking.org/result/2412250-NE-AMDINVLPG56&grs&rdt .
Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
Chipset: AMD 1Ah
Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive
Graphics: ASPEED
Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10
Kernel (INVLPGB Patched): 6.13.0-rc4-phx-broadcast-tlb (x86_64)
Kernel (Linux 6.13 Git): 6.13.0-rc4-phx-stock (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 1024x768
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / relatime,rw,stripe=64 / Block Size: 4096
Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116
Java Details - OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
Python Details - Python 3.12.7
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Test | INVLPGB Patched | Linux 6.13 Git
dacapobench: Zxing 1D/2D Barcode Image Processing | 619 | 648
dacapobench: Jython | 4158 | 4320
mariadb: oltp_update_index - 256 | 117679 | 113658
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | 106.20884 | 109.86392
cassandra: Writes | 452025 | 438419
apache-iotdb: 800 - 100 - 800 - 400 (Average Latency) | 206.65 | 212.98
nginx: 500 | 517615.54 | 503771.76
quicksilver: CORAL2 P1 | 34996667 | 34086667
whisper-cpp: ggml-small.en - 2016 State of the Union | 219.87375 | 225.41952
npb: EP.C | 11450.17 | 11180.11
apache-iotdb: 800 - 100 - 800 - 400 (point/sec) | 144092202 | 141171394
clickhouse: 100M Rows Hits Dataset, Second Run | 749.62 | 764.71
apache-iotdb: 500 - 100 - 500 - 100 (Average Latency) | 47.03 | 47.85
dacapobench: GraphChi | 2323 | 2363
speedb: Read Rand Write Rand | 3106319 | 3054826
whisper-cpp: ggml-medium.en - 2016 State of the Union | 453.06996 | 460.62129
renaissance: Savina Reactors.IO | 5651.4 | 5722.9
apache-iotdb: 500 - 100 - 800 - 400 (Average Latency) | 214.48 | 211.81
rodinia: OpenMP Leukocyte | 31.298 | 31.674
dacapobench: Avrora AVR Simulation Framework | 2556 | 2585
clickhouse: 100M Rows Hits Dataset, Third Run | 759.44 | 751.84
apache-iotdb: 500 - 100 - 500 - 100 (point/sec) | 97751377 | 96806809
renaissance: Apache Spark PageRank | 2283.3 | 2261.3
apache-iotdb: 500 - 100 - 800 - 100 (Average Latency) | 60.93 | 61.49
apache-iotdb: 800 - 100 - 500 - 100 (Average Latency) | 40.04 | 40.40
renaissance: Apache Spark Bayes | 181.7 | 180.1
npb: BT.C | 356244.91 | 353192.16
apache-iotdb: 800 - 100 - 500 - 100 (point/sec) | 118890170 | 117887210
renaissance: ALS Movie Lens | 18368.8 | 18214.8
speedb: Rand Read | 627286939 | 622084083
apache-iotdb: 500 - 100 - 800 - 100 (point/sec) | 121681953 | 120738449
apache-iotdb: 500 - 100 - 500 - 400 (point/sec) | 100250585 | 99520170
npb: SP.C | 155286.47 | 154232.79
mariadb: oltp_read_write - 256 | 173765 | 172615
rocksdb: Read While Writing | 14737550 | 14641889
build-python: Default | 12.984 | 13.067
apache-iotdb: 500 - 100 - 800 - 400 (point/sec) | 124631354 | 125423055
dacapobench: BioJava Biological Data Framework | 4547 | 4519
build-python: Released Build, PGO + LTO Optimized | 191.972 | 190.912
clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache | 735.32 | 738.99
dacapobench: Apache Xalan XSLT | 812 | 816
mariadb: oltp_read_only - 256 | 37868 | 37688
speedb: Update Rand | 522238 | 519867
apache-iotdb: 800 - 100 - 500 - 400 (point/sec) | 120726010 | 121237884
build-llvm: Ninja | 90.775 | 90.415
memcached: 1:5 | 3583145.46 | 3569817.66
blender: Barbershop - CPU-Only | 124.48 | 124.03
npb: LU.C | 311722.61 | 312554.83
build-llvm: Unix Makefiles | 161.060 | 160.666
apache-iotdb: 500 - 100 - 500 - 400 (Average Latency) | 167.95 | 168.36
npb: IS.D | 6984.65 | 7001.42
build-godot: Time To Compile | 80.831 | 80.687
namd: STMV with 1,066,628 Atoms | 4.20973 | 4.21510
dacapobench: Apache Kafka | 5050 | 5046
dacapobench: jMonkeyEngine | 6806 | 6808
renaissance: In-Memory Database Shootout | 4512.9 | 4511.6
namd: ATPase with 327,506 Atoms | 12.22125 | 12.22443
quicksilver: CORAL2 P2 | 22056667 | 22056667
xmrig: GhostRider - 1M | 16061.4 | 13706.9
relion: Basic - CPU | 211.380 | 214.442
npb: MG.C | 147724.66 | 145177.79
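The summary values above mix "more is better" and "fewer is better" metrics, so raw differences are hard to compare at a glance. The following is a minimal sketch, not part of the exported result, that turns a handful of those rows into a signed percentage. The function name `patched_advantage_pct`, the row selection, and the sign convention (positive means the INVLPGB-patched kernel did better) are assumptions made for illustration; the numbers are copied from the summary above and the metric direction from the per-test results that follow.

```python
# Illustrative sketch only: not part of the Phoronix Test Suite export.
# Names and the sign convention are assumptions; values come from the summary.

def patched_advantage_pct(patched: float, stock: float, higher_is_better: bool) -> float:
    """Percent by which the patched kernel beats the stock kernel (negative = worse)."""
    if higher_is_better:
        return (patched / stock - 1.0) * 100.0
    # For time and latency metrics a smaller number is the better result.
    return (stock / patched - 1.0) * 100.0


# (test, INVLPGB Patched, Linux 6.13 Git, higher_is_better)
ROWS = [
    ("dacapobench: Zxing 1D/2D Barcode Image Processing", 619.0, 648.0, False),
    ("mariadb: oltp_update_index - 256", 117679.0, 113658.0, True),
    ("nginx: 500", 517615.54, 503771.76, True),
    ("clickhouse: 100M Rows Hits Dataset, Second Run", 749.62, 764.71, True),
    ("quicksilver: CORAL2 P2", 22056667.0, 22056667.0, True),
]


def report(rows=ROWS):
    """Return (test name, rounded percent advantage) pairs for the given rows."""
    return [
        (name, round(patched_advantage_pct(patched, stock, hib), 2))
        for name, patched, stock, hib in rows
    ]


if __name__ == "__main__":
    for name, pct in report():
        print(f"{name}: {pct:+.2f}%")
```

Dividing in the direction that matches each metric keeps the sign meaningful across throughput and latency tests, which a plain subtraction would not.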
DaCapo Benchmark 23.11 - Java Test: Zxing 1D/2D Barcode Image Processing (msec, Fewer Is Better): INVLPGB Patched 619 (SE +/- 6.43, N = 3); Linux 6.13 Git 648 (SE +/- 4.04, N = 3).
DaCapo Benchmark 23.11 - Java Test: Jython (msec, Fewer Is Better): INVLPGB Patched 4158 (SE +/- 10.53, N = 3); Linux 6.13 Git 4320 (SE +/- 37.01, N = 15).
MariaDB 11.5 - Test: oltp_update_index - Threads: 256 (Queries Per Second, More Is Better): INVLPGB Patched 117679 (SE +/- 200.13, N = 3); Linux 6.13 Git 113658 (SE +/- 190.75, N = 3). Notes: 1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better): INVLPGB Patched 106.21; Linux 6.13 Git 109.86. Notes: 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
Apache Cassandra 5.0 - Test: Writes (Op/s, More Is Better): INVLPGB Patched 452025 (SE +/- 2371.91, N = 3); Linux 6.13 Git 438419 (SE +/- 5248.18, N = 3).
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better): INVLPGB Patched 206.65 (SE +/- 3.85, N = 3; MAX: 26883.6); Linux 6.13 Git 212.98 (SE +/- 2.18, N = 3; MAX: 26561.37).
nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better): INVLPGB Patched 517615.54 (SE +/- 1146.82, N = 3); Linux 6.13 Git 503771.76 (SE +/- 1734.77, N = 3). Notes: 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
Quicksilver 20230818 - Input: CORAL2 P1 (Figure Of Merit, More Is Better): INVLPGB Patched 34996667 (SE +/- 98713.95, N = 3); Linux 6.13 Git 34086667 (SE +/- 46666.67, N = 3). Notes: 1. (CXX) g++ options: -fopenmp -O3 -march=native
Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better): INVLPGB Patched 219.87 (SE +/- 1.60, N = 3); Linux 6.13 Git 225.42 (SE +/- 0.39, N = 3). Notes: 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better): INVLPGB Patched 11450.17 (SE +/- 98.82, N = 15); Linux 6.13 Git 11180.11 (SE +/- 27.23, N = 3). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better): INVLPGB Patched 144092202 (SE +/- 1488492.23, N = 3); Linux 6.13 Git 141171394 (SE +/- 870384.94, N = 3).
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better): INVLPGB Patched 749.62 (SE +/- 1.96, N = 3; MIN: 69.28 / MAX: 6666.67); Linux 6.13 Git 764.71 (SE +/- 4.01, N = 3; MIN: 69.85 / MAX: 8571.43).
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (Average Latency, Fewer Is Better): INVLPGB Patched 47.03 (SE +/- 0.36, N = 3; MAX: 12578.34); Linux 6.13 Git 47.85 (SE +/- 0.55, N = 3; MAX: 12576.32).
DaCapo Benchmark 23.11 - Java Test: GraphChi (msec, Fewer Is Better): INVLPGB Patched 2323 (SE +/- 15.84, N = 3); Linux 6.13 Git 2363 (SE +/- 8.96, N = 3).
Speedb 2.7 - Test: Read Random Write Random (Op/s, More Is Better): INVLPGB Patched 3106319 (SE +/- 22974.88, N = 3); Linux 6.13 Git 3054826 (SE +/- 5973.22, N = 3). Notes: 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better): INVLPGB Patched 453.07 (SE +/- 0.73, N = 3); Linux 6.13 Git 460.62 (SE +/- 3.15, N = 3). Notes: 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Renaissance 0.16 - Test: Savina Reactors.IO (ms, Fewer Is Better): INVLPGB Patched 5651.4 (SE +/- 52.24, N = 7; MIN: 5405.73 / MAX: 9193.8); Linux 6.13 Git 5722.9 (SE +/- 75.13, N = 3; MIN: 5596.11 / MAX: 9196.98).
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better): INVLPGB Patched 214.48 (SE +/- 0.65, N = 3; MAX: 26489.9); Linux 6.13 Git 211.81 (SE +/- 2.05, N = 3; MAX: 26536.58).
Rodinia 3.1 - Test: OpenMP Leukocyte (Seconds, Fewer Is Better): INVLPGB Patched 31.30 (SE +/- 0.16, N = 3); Linux 6.13 Git 31.67 (SE +/- 0.33, N = 3). Notes: 1. (CXX) g++ options: -O2 -lOpenCL
DaCapo Benchmark 23.11 - Java Test: Avrora AVR Simulation Framework (msec, Fewer Is Better): INVLPGB Patched 2556 (SE +/- 19.37, N = 3); Linux 6.13 Git 2585 (SE +/- 8.50, N = 3).
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better): INVLPGB Patched 759.44 (SE +/- 3.07, N = 3; MIN: 69.04 / MAX: 8571.43); Linux 6.13 Git 751.84 (SE +/- 0.95, N = 3; MIN: 70.34 / MAX: 6666.67).
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (point/sec, More Is Better): INVLPGB Patched 97751377 (SE +/- 291260.63, N = 3); Linux 6.13 Git 96806809 (SE +/- 1081476.60, N = 3).
Renaissance 0.16 - Test: Apache Spark PageRank (ms, Fewer Is Better): INVLPGB Patched 2283.3 (SE +/- 22.38, N = 3; MIN: 1561.41 / MAX: 2328.07); Linux 6.13 Git 2261.3 (SE +/- 9.21, N = 3; MIN: 1572.27 / MAX: 2279.05).
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (Average Latency, Fewer Is Better): INVLPGB Patched 60.93 (SE +/- 0.25, N = 3; MAX: 11306.84); Linux 6.13 Git 61.49 (SE +/- 0.36, N = 3; MAX: 11313.55).
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (Average Latency, Fewer Is Better): INVLPGB Patched 40.04 (SE +/- 0.04, N = 3; MAX: 23821.12); Linux 6.13 Git 40.40 (SE +/- 0.37, N = 3; MAX: 23837.32).
Renaissance 0.16 - Test: Apache Spark Bayes (ms, Fewer Is Better): INVLPGB Patched 181.7 (SE +/- 1.59, N = 3; MIN: 164.62 / MAX: 284.35); Linux 6.13 Git 180.1 (SE +/- 0.85, N = 3; MIN: 160.23 / MAX: 297.68).
NAS Parallel Benchmarks 3.4 - Test / Class: BT.C (Total Mop/s, More Is Better): INVLPGB Patched 356244.91 (SE +/- 3962.64, N = 5); Linux 6.13 Git 353192.16 (SE +/- 3855.67, N = 3). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (point/sec, More Is Better): INVLPGB Patched 118890170 (SE +/- 226557.07, N = 3); Linux 6.13 Git 117887210 (SE +/- 1087988.19, N = 3).
Renaissance 0.16 - Test: ALS Movie Lens (ms, Fewer Is Better): INVLPGB Patched 18368.8 (SE +/- 114.89, N = 3; MIN: 17674.24 / MAX: 18580.57); Linux 6.13 Git 18214.8 (SE +/- 45.86, N = 3; MIN: 17691.46 / MAX: 18303.77).
Speedb 2.7 - Test: Random Read (Op/s, More Is Better): INVLPGB Patched 627286939 (SE +/- 1540848.02, N = 3); Linux 6.13 Git 622084083 (SE +/- 7230810.31, N = 3). Notes: 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (point/sec, More Is Better): INVLPGB Patched 121681953 (SE +/- 67599.70, N = 3); Linux 6.13 Git 120738449 (SE +/- 58310.64, N = 3).
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (point/sec, More Is Better): INVLPGB Patched 100250585 (SE +/- 97805.09, N = 3); Linux 6.13 Git 99520170 (SE +/- 603733.82, N = 3).
NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better): INVLPGB Patched 155286.47 (SE +/- 279.05, N = 3); Linux 6.13 Git 154232.79 (SE +/- 1469.84, N = 3). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
MariaDB 11.5 - Test: oltp_read_write - Threads: 256 (Queries Per Second, More Is Better): INVLPGB Patched 173765 (SE +/- 93.11, N = 3); Linux 6.13 Git 172615 (SE +/- 73.62, N = 3). Notes: 1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better): INVLPGB Patched 14737550 (SE +/- 155717.24, N = 15); Linux 6.13 Git 14641889 (SE +/- 157207.18, N = 3). Notes: 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Timed CPython Compilation 3.10.6 - Build Configuration: Default (Seconds, Fewer Is Better): INVLPGB Patched 12.98; Linux 6.13 Git 13.07.
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better): INVLPGB Patched 124631354 (SE +/- 332877.17, N = 3); Linux 6.13 Git 125423055 (SE +/- 352556.17, N = 3).
DaCapo Benchmark 23.11 - Java Test: BioJava Biological Data Framework (msec, Fewer Is Better): INVLPGB Patched 4547 (SE +/- 17.52, N = 3); Linux 6.13 Git 4519 (SE +/- 20.17, N = 3).
Timed CPython Compilation 3.10.6 - Build Configuration: Released Build, PGO + LTO Optimized (Seconds, Fewer Is Better): INVLPGB Patched 191.97; Linux 6.13 Git 190.91.
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better): INVLPGB Patched 735.32 (SE +/- 4.99, N = 3; MIN: 68.03 / MAX: 7500); Linux 6.13 Git 738.99 (SE +/- 6.88, N = 3; MIN: 69.85 / MAX: 7500).
DaCapo Benchmark 23.11 - Java Test: Apache Xalan XSLT (msec, Fewer Is Better): INVLPGB Patched 812 (SE +/- 2.00, N = 3); Linux 6.13 Git 816 (SE +/- 7.60, N = 6).
MariaDB 11.5 - Test: oltp_read_only - Threads: 256 (Queries Per Second, More Is Better): INVLPGB Patched 37868 (SE +/- 112.63, N = 3); Linux 6.13 Git 37688 (SE +/- 17.06, N = 3). Notes: 1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
Speedb 2.7 - Test: Update Random (Op/s, More Is Better): INVLPGB Patched 522238 (SE +/- 726.96, N = 3); Linux 6.13 Git 519867 (SE +/- 210.08, N = 3). Notes: 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Apache IoTDB 1.2 - Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (point/sec, More Is Better): INVLPGB Patched 120726010 (SE +/- 1093112.93, N = 3); Linux 6.13 Git 121237884 (SE +/- 875042.28, N = 3).
Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, Fewer Is Better): INVLPGB Patched 90.78 (SE +/- 0.11, N = 3); Linux 6.13 Git 90.42 (SE +/- 0.24, N = 3).
Memcached 1.6.19 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better): INVLPGB Patched 3583145.46 (SE +/- 7706.14, N = 3); Linux 6.13 Git 3569817.66 (SE +/- 2641.20, N = 3). Notes: 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Blender 4.3 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): INVLPGB Patched 124.48 (SE +/- 0.12, N = 3); Linux 6.13 Git 124.03 (SE +/- 0.13, N = 3).
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better): INVLPGB Patched 311722.61 (SE +/- 2363.19, N = 10); Linux 6.13 Git 312554.83 (SE +/- 2270.42, N = 15). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better): INVLPGB Patched 161.06 (SE +/- 0.59, N = 3); Linux 6.13 Git 160.67 (SE +/- 0.32, N = 3).
Apache IoTDB 1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (Average Latency, Fewer Is Better): INVLPGB Patched 167.95 (SE +/- 0.71, N = 3; MAX: 26468.05); Linux 6.13 Git 168.36 (SE +/- 0.88, N = 3; MAX: 26450.24).
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better): INVLPGB Patched 6984.65 (SE +/- 68.35, N = 5); Linux 6.13 Git 7001.42 (SE +/- 60.97, N = 3). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
Timed Godot Game Engine Compilation 4.0 - Time To Compile (Seconds, Fewer Is Better): INVLPGB Patched 80.83 (SE +/- 0.16, N = 3); Linux 6.13 Git 80.69 (SE +/- 0.05, N = 3).
NAMD 3.0 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): INVLPGB Patched 4.20973 (SE +/- 0.00334, N = 3); Linux 6.13 Git 4.21510 (SE +/- 0.00531, N = 3).
DaCapo Benchmark 23.11 - Java Test: Apache Kafka (msec, Fewer Is Better): INVLPGB Patched 5050 (SE +/- 6.08, N = 3); Linux 6.13 Git 5046 (SE +/- 4.26, N = 3).
DaCapo Benchmark 23.11 - Java Test: jMonkeyEngine (msec, Fewer Is Better): INVLPGB Patched 6806 (SE +/- 1.67, N = 3); Linux 6.13 Git 6808 (SE +/- 1.67, N = 3).
Renaissance 0.16 - Test: In-Memory Database Shootout (ms, Fewer Is Better): INVLPGB Patched 4512.9 (SE +/- 34.36, N = 3; MIN: 4347.69 / MAX: 5072.64); Linux 6.13 Git 4511.6 (SE +/- 55.34, N = 4; MIN: 4295.26 / MAX: 5024.51).
NAMD 3.0 - Input: ATPase with 327,506 Atoms (ns/day, More Is Better): INVLPGB Patched 12.22 (SE +/- 0.03, N = 3); Linux 6.13 Git 12.22 (SE +/- 0.03, N = 3).
Quicksilver 20230818 - Input: CORAL2 P2 (Figure Of Merit, More Is Better): INVLPGB Patched 22056667 (SE +/- 18559.21, N = 3); Linux 6.13 Git 22056667 (SE +/- 12018.50, N = 3). Notes: 1. (CXX) g++ options: -fopenmp -O3 -march=native
Xmrig 6.21 - Variant: GhostRider - Hash Count: 1M (H/s, More Is Better): INVLPGB Patched 16061.4 (SE +/- 32.37, N = 3); Linux 6.13 Git 13706.9 (SE +/- 668.22, N = 15). Notes: 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
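The Xmrig gap is the largest in this result, but one of its two runs also reports a standard error of 668.22 over 15 runs, so a rough sanity check is worthwhile before reading much into it. The sketch below, which is not part of the exported result, divides the difference between two means by their combined standard error using only figures reported in this file. The helper name `gap_in_standard_errors` and the choice of this simple ratio rather than a formal significance test are assumptions made for illustration.

```python
import math

# Illustrative sketch only: the helper name and the use of a simple ratio
# (instead of a formal significance test) are assumptions.


def gap_in_standard_errors(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference of two means expressed in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


# Xmrig GhostRider, H/s: means and standard errors as reported in this result.
xmrig_ratio = gap_in_standard_errors(16061.4, 32.37, 13706.9, 668.22)

# Blender Barbershop, seconds: 124.48 (SE 0.12) and 124.03 (SE 0.13).
blender_ratio = gap_in_standard_errors(124.48, 0.12, 124.03, 0.13)

if __name__ == "__main__":
    print(f"xmrig:   {xmrig_ratio:.2f} combined standard errors")
    print(f"blender: {blender_ratio:.2f} combined standard errors")
```

With run counts this small the ratio is only a rough guide: a value well above 2 suggests the gap is probably not run-to-run noise alone, while a value near or below 1 means the two results are hard to tell apart.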
RELION 5.0 - Test: Basic - Device: CPU (Seconds, Fewer Is Better): INVLPGB Patched 211.38 (SE +/- 4.55, N = 12); Linux 6.13 Git 214.44 (SE +/- 3.72, N = 12). Notes: 1. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better): INVLPGB Patched 147724.66 (SE +/- 1764.21, N = 3); Linux 6.13 Git 145177.79 (SE +/- 2956.05, N = 15). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
Phoronix Test Suite v10.8.5