Benchmarks for a future article looking at the AMD broadcast TLB invalidation Linux kernel patches, which use the INVLPGB instruction available on AMD Zen 3 and newer processors.
Linux 6.13 Git
Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads), Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS), Chipset: AMD 1Ah, Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10, Kernel: 6.13.0-rc4-phx-stock (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / relatime,rw,stripe=64 / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116
Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
Python Notes: Python 3.12.7
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
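The Kernel Notes field records the Transparent Huge Pages mode (madvise on this system). On Linux that mode is exposed through sysfs, where the active option is the one shown in square brackets. A minimal Python sketch for reading it follows; `active_thp_mode` is a hypothetical helper name used only for illustration and is not part of the Phoronix Test Suite.

```python
from pathlib import Path

# Standard sysfs location of the THP setting on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed option from a line such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print("Transparent Huge Pages:", active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("THP sysfs file not present on this system")
```

Checking this value on both kernels before a run is one way to confirm that the two configurations differ only in the kernel build.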
INVLPGB Patched
OS: Ubuntu 24.10, Kernel: 6.13.0-rc4-phx-broadcast-tlb (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1024x768
Linux 6.13 Git vs. INVLPGB Patched Comparison (Phoronix Test Suite), relative difference between the two kernels per test:
  Xmrig, GhostRider - 1M: 17.2%
  DaCapo Benchmark, Z.1.2.B.I.P: 4.7%
  DaCapo Benchmark, Jython: 3.9%
  MariaDB, oltp_update_index - 256: 3.5%
  OpenFOAM, d.M.M.S - Mesh Time: 3.4%
  Apache Cassandra, Writes: 3.1%
  Apache IoTDB, 800 - 100 - 800 - 400: 3.1%
  nginx, 500: 2.7%
  Quicksilver, CORAL2 P1: 2.7%
  Whisper.cpp, ggml-small.en - 2.S.o.t.U: 2.5%
  NAS Parallel Benchmarks, EP.C: 2.4%
  Apache IoTDB, 800 - 100 - 800 - 400: 2.1%
  ClickHouse, 1.R.H.D.S.R: 2%
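The percentages in the comparison above appear to express how far the better result exceeds the worse one (larger value divided by smaller value, minus one). That reading is inferred from the numbers reported on this page and is not a documented Phoronix Test Suite formula. A minimal Python sketch, with `relative_gap` as a hypothetical helper name, reproduces three of the figures from results listed on this page:

```python
def relative_gap(a: float, b: float) -> float:
    """Percentage by which the larger of two results exceeds the smaller one."""
    lo, hi = sorted((a, b))
    return (hi / lo - 1.0) * 100.0


# (Linux 6.13 Git, INVLPGB Patched) pairs copied from the results on this page.
results = {
    "Xmrig GhostRider 1M (H/s)": (13706.9, 16061.4),
    "MariaDB oltp_update_index 256 (queries/s)": (113658, 117679),
    "nginx 500 connections (requests/s)": (503771.76, 517615.54),
}

for name, (stock, patched) in results.items():
    print(f"{name}: {relative_gap(stock, patched):.1f}%")
```

Because the helper sorts its inputs, it works for both "More Is Better" and "Fewer Is Better" tests, but it does not say which kernel won; that has to be read from the individual results.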
AMD INVLPGB Linux Patch Performance (OpenBenchmarking.org)
Test: Linux 6.13 Git / INVLPGB Patched
  relion: Basic - CPU: 214.442 / 211.380
  whisper-cpp: ggml-medium.en - 2016 State of the Union: 460.62129 / 453.06996
  mariadb: oltp_update_index - 256: 113658 / 117679
  mariadb: oltp_read_only - 256: 37688 / 37868
  mariadb: oltp_read_write - 256: 172615 / 173765
  xmrig: GhostRider - 1M: 13706.9 / 16061.4
  whisper-cpp: ggml-small.en - 2016 State of the Union: 225.41952 / 219.87375
  quicksilver: CORAL2 P2: 22056667 / 22056667
  rocksdb: Read While Writing: 14641889 / 14737550
  build-llvm: Unix Makefiles: 160.666 / 161.060
  apache-iotdb: 800 - 100 - 800 - 400 (Average Latency): 212.98 / 206.65
  apache-iotdb: 800 - 100 - 800 - 400 (point/sec): 141171394 / 144092202
  clickhouse: 100M Rows Hits Dataset, Third Run: 751.84 / 759.44
  clickhouse: 100M Rows Hits Dataset, Second Run: 764.71 / 749.62
  clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache: 738.99 / 735.32
  apache-iotdb: 800 - 100 - 500 - 400: 121237884 / 120726010
  cassandra: Writes: 438419 / 452025
  blender: Barbershop - CPU-Only: 124.03 / 124.48
  apache-iotdb: 500 - 100 - 800 - 400 (Average Latency): 211.81 / 214.48
  apache-iotdb: 500 - 100 - 800 - 400 (point/sec): 125423055 / 124631354
  apache-iotdb: 800 - 100 - 500 - 100 (Average Latency): 40.40 / 40.04
  apache-iotdb: 800 - 100 - 500 - 100 (point/sec): 117887210 / 118890170
  apache-iotdb: 500 - 100 - 800 - 100 (Average Latency): 61.49 / 60.93
  apache-iotdb: 500 - 100 - 800 - 100 (point/sec): 120738449 / 121681953
  renaissance: Savina Reactors.IO: 5722.9 / 5651.4
  openfoam: drivaerFastback, Medium Mesh Size - Mesh Time: 109.86392 / 106.20884
  apache-iotdb: 500 - 100 - 500 - 400 (Average Latency): 168.36 / 167.95
  apache-iotdb: 500 - 100 - 500 - 400 (point/sec): 99520170 / 100250585
  apache-iotdb: 500 - 100 - 500 - 100 (Average Latency): 47.85 / 47.03
  apache-iotdb: 500 - 100 - 500 - 100 (point/sec): 96806809 / 97751377
  renaissance: ALS Movie Lens: 18214.8 / 18368.8
  build-llvm: Ninja: 90.415 / 90.775
  nginx: 500: 503771.76 / 517615.54
  renaissance: In-Memory Database Shootout: 4511.6 / 4512.9
  build-godot: Time To Compile: 80.687 / 80.831
  renaissance: Apache Spark Bayes: 180.1 / 181.7
  renaissance: Apache Spark PageRank: 2261.3 / 2283.3
  memcached: 1:5: 3569817.66 / 3583145.46
  build-python: Released Build, PGO + LTO Optimized: 190.912 / 191.972
  speedb: Update Rand: 519867 / 522238
  speedb: Read Rand Write Rand: 3054826 / 3106319
  speedb: Rand Read: 622084083 / 627286939
  dacapobench: Jython: 4320 / 4158
  quicksilver: CORAL2 P1: 34086667 / 34996667
  npb: LU.C: 312554.83 / 311722.61
  rodinia: OpenMP Leukocyte: 31.674 / 31.298
  dacapobench: jMonkeyEngine: 6808 / 6806
  dacapobench: Apache Kafka: 5046 / 5050
  namd: STMV with 1,066,628 Atoms: 4.21510 / 4.20973
  dacapobench: BioJava Biological Data Framework: 4519 / 4547
  npb: BT.C: 353192.16 / 356244.91
  dacapobench: GraphChi: 2363 / 2323
  npb: SP.C: 154232.79 / 155286.47
  npb: IS.D: 7001.42 / 6984.65
  dacapobench: Avrora AVR Simulation Framework: 2585 / 2556
  npb: MG.C: 145177.79 / 147724.66
  namd: ATPase with 327,506 Atoms: 12.22443 / 12.22125
  npb: EP.C: 11180.11 / 11450.17
  dacapobench: Apache Xalan XSLT: 816 / 812
  build-python: Default: 13.067 / 12.984
  dacapobench: Zxing 1D/2D Barcode Image Processing: 648 / 619
RELION 5.0, Test: Basic - Device: CPU (Seconds, Fewer Is Better)
  Linux 6.13 Git: 214.44 (SE +/- 3.72, N = 12)
  INVLPGB Patched: 211.38 (SE +/- 4.55, N = 12)
  1. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi
Whisper.cpp 1.6.2, Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  Linux 6.13 Git: 460.62 (SE +/- 3.15, N = 3)
  INVLPGB Patched: 453.07 (SE +/- 0.73, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
MariaDB
This is a MariaDB MySQL database server benchmark making use of sysbench rather than the existing pts/mysqlslap test profile that uses MariaDB with mysqlslap/mariadb-slap as the benchmark driver. Learn more via the OpenBenchmarking.org test page.
MariaDB 11.5, Test: oltp_update_index - Threads: 256 (Queries Per Second, More Is Better)
  Linux 6.13 Git: 113658 (SE +/- 190.75, N = 3)
  INVLPGB Patched: 117679 (SE +/- 200.13, N = 3)
MariaDB 11.5, Test: oltp_read_only - Threads: 256 (Queries Per Second, More Is Better)
  Linux 6.13 Git: 37688 (SE +/- 17.06, N = 3)
  INVLPGB Patched: 37868 (SE +/- 112.63, N = 3)
MariaDB 11.5, Test: oltp_read_write - Threads: 256 (Queries Per Second, More Is Better)
  Linux 6.13 Git: 172615 (SE +/- 73.62, N = 3)
  INVLPGB Patched: 173765 (SE +/- 93.11, N = 3)
  1. (CXX) g++ options (all three MariaDB tests): -pie -fPIC -fstack-protector -O3 -lnuma -lpcre2-8 -lcrypt -laio -lz -lm -lssl -lcrypto -lpthread -ldl
Xmrig
Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.21, Variant: GhostRider - Hash Count: 1M (H/s, More Is Better)
  Linux 6.13 Git: 13706.9 (SE +/- 668.22, N = 15)
  INVLPGB Patched: 16061.4 (SE +/- 32.37, N = 3)
  1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Whisper.cpp 1.6.2, Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  Linux 6.13 Git: 225.42 (SE +/- 0.39, N = 3)
  INVLPGB Patched: 219.87 (SE +/- 1.60, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
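Each result is reported with a standard error (SE) and a run count (N), which gives a rough sense of whether a gap between the two kernels stands out from run-to-run noise. The sketch below is one simple way to look at that, under the assumption that the two results are independent so their standard errors can be combined in quadrature. This is an illustrative heuristic rather than anything the Phoronix Test Suite itself reports, and `gap_in_standard_errors` is a hypothetical helper name.

```python
import math


def gap_in_standard_errors(a: float, se_a: float, b: float, se_b: float) -> float:
    """Absolute difference of two means divided by their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)  # assumes independent runs
    return abs(a - b) / combined


# Means and standard errors copied from the RELION and Xmrig results above.
relion = gap_in_standard_errors(214.44, 3.72, 211.38, 4.55)
xmrig = gap_in_standard_errors(13706.9, 668.22, 16061.4, 32.37)

print(f"RELION Basic:     gap = {relion:.2f} combined SE")
print(f"Xmrig GhostRider: gap = {xmrig:.2f} combined SE")
```

With these inputs the RELION gap comes out at roughly half of one combined standard error, while the Xmrig gap is roughly three and a half, so only the latter clearly exceeds the reported noise under this heuristic.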
Quicksilver
Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
Quicksilver 20230818, Input: CORAL2 P2 (Figure Of Merit, More Is Better)
  Linux 6.13 Git: 22056667 (SE +/- 12018.50, N = 3)
  INVLPGB Patched: 22056667 (SE +/- 18559.21, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -march=native
RocksDB
This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.
RocksDB 9.0, Test: Read While Writing (Op/s, More Is Better)
  Linux 6.13 Git: 14641889 (SE +/- 157207.18, N = 3)
  INVLPGB Patched: 14737550 (SE +/- 155717.24, N = 15)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Apache IoTDB
Apache IoTDB is a time series database and this benchmark is facilitated using the IoT Benchmark [https://github.com/thulab/iot-benchmark/]. Learn more via the OpenBenchmarking.org test page.
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 212.98 (SE +/- 2.18, N = 3; MAX: 26561.37)
  INVLPGB Patched: 206.65 (SE +/- 3.85, N = 3; MAX: 26883.6)
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 141171394 (SE +/- 870384.94, N = 3)
  INVLPGB Patched: 144092202 (SE +/- 1488492.23, N = 3)
ClickHouse
ClickHouse is an open-source, high performance OLAP data management system. This test profile uses ClickHouse's standard benchmark recommendations per https://clickhouse.com/docs/en/operations/performance-test/ / https://github.com/ClickHouse/ClickBench/tree/main/clickhouse with the 100 million rows web analytics dataset. The reported value is the query processing time using the geometric mean of all separate queries performed as an aggregate. Learn more via the OpenBenchmarking.org test page.
ClickHouse 22.12.3.5, 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better)
  Linux 6.13 Git: 751.84 (SE +/- 0.95, N = 3; MIN: 70.34 / MAX: 6666.67)
  INVLPGB Patched: 759.44 (SE +/- 3.07, N = 3; MIN: 69.04 / MAX: 8571.43)
ClickHouse 22.12.3.5, 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better)
  Linux 6.13 Git: 764.71 (SE +/- 4.01, N = 3; MIN: 69.85 / MAX: 8571.43)
  INVLPGB Patched: 749.62 (SE +/- 1.96, N = 3; MIN: 69.28 / MAX: 6666.67)
ClickHouse 22.12.3.5, 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better)
  Linux 6.13 Git: 738.99 (SE +/- 6.88, N = 3; MIN: 69.85 / MAX: 7500)
  INVLPGB Patched: 735.32 (SE +/- 4.99, N = 3; MIN: 68.03 / MAX: 7500)
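The ClickHouse figures are described as a geometric mean over all separate queries, and the MIN/MAX values show that individual query rates span about two orders of magnitude. A minimal Python sketch of that aggregation follows. The per-query values are hypothetical and chosen only to show the effect; they are not taken from the result file, and `geo_mean` is an illustrative helper rather than the test profile's own code.

```python
import math
import statistics


def geo_mean(values):
    """Geometric mean: the n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-query rates in queries per minute (NOT from the result file).
per_query_qpm = [70.0, 700.0, 7000.0]

print(f"arithmetic mean: {statistics.fmean(per_query_qpm):.1f}")
print(f"geometric mean:  {geo_mean(per_query_qpm):.1f}")
```

With a spread this wide, the arithmetic mean is dominated by the fastest query, whereas the geometric mean weights each query's relative change equally, which is a common reason for using it on aggregates like this.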
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 121237884 (SE +/- 875042.28, N = 3)
  INVLPGB Patched: 120726010 (SE +/- 1093112.93, N = 3)
Blender 4.3, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  Linux 6.13 Git: 124.03 (SE +/- 0.13, N = 3)
  INVLPGB Patched: 124.48 (SE +/- 0.12, N = 3)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 211.81 (SE +/- 2.05, N = 3; MAX: 26536.58)
  INVLPGB Patched: 214.48 (SE +/- 0.65, N = 3; MAX: 26489.9)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 125423055 (SE +/- 352556.17, N = 3)
  INVLPGB Patched: 124631354 (SE +/- 332877.17, N = 3)
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 40.40 (SE +/- 0.37, N = 3; MAX: 23837.32)
  INVLPGB Patched: 40.04 (SE +/- 0.04, N = 3; MAX: 23821.12)
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (point/sec, More Is Better)
  Linux 6.13 Git: 117887210 (SE +/- 1087988.19, N = 3)
  INVLPGB Patched: 118890170 (SE +/- 226557.07, N = 3)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 61.49 (SE +/- 0.36, N = 3; MAX: 11313.55)
  INVLPGB Patched: 60.93 (SE +/- 0.25, N = 3; MAX: 11306.84)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (point/sec, More Is Better)
  Linux 6.13 Git: 120738449 (SE +/- 58310.64, N = 3)
  INVLPGB Patched: 121681953 (SE +/- 67599.70, N = 3)
Renaissance 0.16, Test: Savina Reactors.IO (ms, Fewer Is Better)
  Linux 6.13 Git: 5722.9 (SE +/- 75.13, N = 3; MIN: 5596.11 / MAX: 9196.98)
  INVLPGB Patched: 5651.4 (SE +/- 52.24, N = 7; MIN: 5405.73 / MAX: 9193.8)
OpenFOAM
OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10, Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  Linux 6.13 Git: 109.86
  INVLPGB Patched: 106.21
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 168.36 (SE +/- 0.88, N = 3; MAX: 26450.24)
  INVLPGB Patched: 167.95 (SE +/- 0.71, N = 3; MAX: 26468.05)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 400 (point/sec, More Is Better)
  Linux 6.13 Git: 99520170 (SE +/- 603733.82, N = 3)
  INVLPGB Patched: 100250585 (SE +/- 97805.09, N = 3)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (Average Latency, Fewer Is Better)
  Linux 6.13 Git: 47.85 (SE +/- 0.55, N = 3; MAX: 12576.32)
  INVLPGB Patched: 47.03 (SE +/- 0.36, N = 3; MAX: 12578.34)
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 - Client Number: 100 (point/sec, More Is Better)
  Linux 6.13 Git: 96806809 (SE +/- 1081476.60, N = 3)
  INVLPGB Patched: 97751377 (SE +/- 291260.63, N = 3)
Renaissance 0.16, Test: ALS Movie Lens (ms, Fewer Is Better)
  Linux 6.13 Git: 18214.8 (SE +/- 45.86, N = 3; MIN: 17691.46 / MAX: 18303.77)
  INVLPGB Patched: 18368.8 (SE +/- 114.89, N = 3; MIN: 17674.24 / MAX: 18580.57)
nginx
This is a benchmark of the lightweight Nginx HTTP(S) web-server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
nginx 1.23.2, Connections: 500 (Requests Per Second, More Is Better)
  Linux 6.13 Git: 503771.76 (SE +/- 1734.77, N = 3)
  INVLPGB Patched: 517615.54 (SE +/- 1146.82, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
Renaissance 0.16, Test: In-Memory Database Shootout (ms, Fewer Is Better)
  Linux 6.13 Git: 4511.6 (SE +/- 55.34, N = 4; MIN: 4295.26 / MAX: 5024.51)
  INVLPGB Patched: 4512.9 (SE +/- 34.36, N = 3; MIN: 4347.69 / MAX: 5072.64)
Renaissance 0.16, Test: Apache Spark Bayes (ms, Fewer Is Better)
  Linux 6.13 Git: 180.1 (SE +/- 0.85, N = 3; MIN: 160.23 / MAX: 297.68)
  INVLPGB Patched: 181.7 (SE +/- 1.59, N = 3; MIN: 164.62 / MAX: 284.35)
Renaissance 0.16, Test: Apache Spark PageRank (ms, Fewer Is Better)
  Linux 6.13 Git: 2261.3 (SE +/- 9.21, N = 3; MIN: 1572.27 / MAX: 2279.05)
  INVLPGB Patched: 2283.3 (SE +/- 22.38, N = 3; MIN: 1561.41 / MAX: 2328.07)
Memcached
Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
Memcached 1.6.19, Set To Get Ratio: 1:5 (Ops/sec, More Is Better)
  Linux 6.13 Git: 3569817.66 (SE +/- 2641.20, N = 3)
  INVLPGB Patched: 3583145.46 (SE +/- 7706.14, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Speedb
Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.
Speedb 2.7, Test: Update Random (Op/s, More Is Better)
  Linux 6.13 Git: 519867 (SE +/- 210.08, N = 3)
  INVLPGB Patched: 522238 (SE +/- 726.96, N = 3)
Speedb 2.7, Test: Read Random Write Random (Op/s, More Is Better)
  Linux 6.13 Git: 3054826 (SE +/- 5973.22, N = 3)
  INVLPGB Patched: 3106319 (SE +/- 22974.88, N = 3)
Speedb 2.7, Test: Random Read (Op/s, More Is Better)
  Linux 6.13 Git: 622084083 (SE +/- 7230810.31, N = 3)
  INVLPGB Patched: 627286939 (SE +/- 1540848.02, N = 3)
  1. (CXX) g++ options (all three Speedb tests): -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Quicksilver 20230818, Input: CORAL2 P1 (Figure Of Merit, More Is Better)
  Linux 6.13 Git: 34086667 (SE +/- 46666.67, N = 3)
  INVLPGB Patched: 34996667 (SE +/- 98713.95, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -march=native
NAS Parallel Benchmarks
NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4, Test / Class: LU.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 312554.83 (SE +/- 2270.42, N = 15)
  INVLPGB Patched: 311722.61 (SE +/- 2363.19, N = 10)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Rodinia
Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
Rodinia 3.1, Test: OpenMP Leukocyte (Seconds, Fewer Is Better)
  Linux 6.13 Git: 31.67 (SE +/- 0.33, N = 3)
  INVLPGB Patched: 31.30 (SE +/- 0.16, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
NAMD 3.0, Input: STMV with 1,066,628 Atoms (ns/day, More Is Better)
  Linux 6.13 Git: 4.21510 (SE +/- 0.00531, N = 3)
  INVLPGB Patched: 4.20973 (SE +/- 0.00334, N = 3)
NAS Parallel Benchmarks 3.4, Test / Class: BT.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 353192.16 (SE +/- 3855.67, N = 3)
  INVLPGB Patched: 356244.91 (SE +/- 3962.64, N = 5)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
NAS Parallel Benchmarks 3.4, Test / Class: SP.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 154232.79 (SE +/- 1469.84, N = 3)
  INVLPGB Patched: 155286.47 (SE +/- 279.05, N = 3)
NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better)
  Linux 6.13 Git: 7001.42 (SE +/- 60.97, N = 3)
  INVLPGB Patched: 6984.65 (SE +/- 68.35, N = 5)
  1. (F9X) gfortran options (both tests): -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 145177.79 (SE +/- 2956.05, N = 15)
  INVLPGB Patched: 147724.66 (SE +/- 1764.21, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
NAMD 3.0, Input: ATPase with 327,506 Atoms (ns/day, More Is Better)
  Linux 6.13 Git: 12.22 (SE +/- 0.03, N = 3)
  INVLPGB Patched: 12.22 (SE +/- 0.03, N = 3)
NAS Parallel Benchmarks 3.4, Test / Class: EP.C (Total Mop/s, More Is Better)
  Linux 6.13 Git: 11180.11 (SE +/- 27.23, N = 3)
  INVLPGB Patched: 11450.17 (SE +/- 98.82, N = 15)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Linux 6.13 Git
Testing initiated at 25 December 2024 10:41 by user phoronix.
INVLPGB Patched
Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads), Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS), Chipset: AMD 1Ah, Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10, Kernel: 6.13.0-rc4-phx-broadcast-tlb (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / relatime,rw,stripe=64 / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116
Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
Python Notes: Python 3.12.7
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 24 December 2024 20:55 by user phoronix.