new sun

AMD EPYC 9655P 96-Core testing with a Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) and ASPEED on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2411245-NE-NEWSUN42126&gru&rdt.

new sun - System Configuration (shared by runs a, b, and c)

Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
Chipset: AMD 1Ah
Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS
Graphics: ASPEED
Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10
Kernel: 6.12.0-rc7-linux-pm-next-phx (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled); CPU Microcode: 0xb002116
Java Details: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

new sun - Result Overview (runs a / b / c)

Llama.cpp b4154 (Tokens Per Second, more is better):
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128: 45.91 / 46.07 / 45.79
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512: 74.34 / 72.56 / 73.99
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024: 92.46 / 95.50 / 94.04
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048: 147.28 / 141.73 / 136.97
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128: 47.72 / 47.84 / 48.16
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512: 72.00 / 74.08 / 73.87
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024: 97.62 / 98.93 / 97.18
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048: 141.60 / 141.33 / 133.76
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128: 94.55 / 92.09 / 89.99
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512: 151.71 / 149.06 / 154.18
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024: 202.95 / 204.44 / 209.51
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048: 286.32 / 297.04 / 290.80

Renaissance 0.16 (ms, fewer is better):
  Scala Dotty: 723.7 / 694.0 / 642.6
  Random Forest: 672.1 / 705.0 / 687.2
  ALS Movie Lens: 21012.4 / 20956.9 / 21047.6
  Apache Spark Bayes: 299.2 / 310.6 / 332.0
  Savina Reactors.IO: 9997.9 / 10225.3 / 10675.6
  Apache Spark PageRank: 2299.3 / 2315.4 / 2294.9
  Finagle HTTP Requests: 2859.7 / 2947.6 / 2933.9
  Gaussian Mixture Model: 5479.3 / 5433.1 / 5503.2
  In-Memory Database Shootout: 10422.6 / 9314.3 / 10030.8
  Akka Unbalanced Cobwebbed Tree: 17556.2 / 17357.9 / 17006.5
  Genetic Algorithm Using Jenetics + Futures: 1417.2 / 1403.0 / 1391.1

Primesieve 12.6 (Seconds, fewer is better):
  Length 1e12: 1.912 / 1.924 / 1.910
  Length 1e13: 23.712 / 23.696 / 23.693

Blender 4.3 (Seconds, fewer is better):
  BMW27 - CPU-Only: 13.67 / 13.59 / 13.68
  Junkshop - CPU-Only: 17.50 / 17.36 / 17.36
  Classroom - CPU-Only: 35.91 / 35.80 / 35.73
  Fishy Cat - CPU-Only: 19.08 / 19.14 / 19.00
  Barbershop - CPU-Only: 125.69 / 125.86 / 125.72
  Pabellon Barcelona - CPU-Only: 42.22 / 42.12 / 42.24
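
Each per-test result below reports a mean across N runs together with a standard error (SE). As a rough illustration of how such a figure is derived from per-run samples, here is a minimal Python sketch; the sample values are invented for illustration and are not taken from this result file.

```python
import math
import statistics

# Hypothetical per-run samples for one test (illustrative only)
samples = [45.8, 46.1, 45.9]

mean = statistics.mean(samples)
# Standard error of the mean: sample standard deviation / sqrt(N)
se = statistics.stdev(samples) / math.sqrt(len(samples))

print(f"mean = {mean:.2f}, SE +/- {se:.2f}, N = {len(samples)}")
```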

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 45.91   b: 46.07   c: 45.79   (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
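
For context, a comparable text-generation throughput figure can be approximated outside the Phoronix Test Suite harness. The sketch below is a rough approximation that assumes the llama-cpp-python bindings and a locally downloaded Llama-3.1-Tulu-3-8B-Q8_0 GGUF file (both assumptions; the actual test profile drives the native llama.cpp build shown above, so numbers will not match exactly).

```python
import time
from llama_cpp import Llama  # assumes the llama-cpp-python bindings are installed

# Hypothetical local path to the quantized model used in this result
llm = Llama(model_path="Llama-3.1-Tulu-3-8B-Q8_0.gguf", n_ctx=2048, n_threads=96)

prompt = "Explain the difference between prompt processing and text generation."
start = time.time()
out = llm(prompt, max_tokens=128)
elapsed = time.time() - start

generated = out["usage"]["completion_tokens"]
print(f"{generated / elapsed:.2f} tokens/s (text generation)")
```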

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 74.34   b: 72.56   c: 73.99   (SE +/- 0.74, N = 6)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 92.46   b: 95.50   c: 94.04   (SE +/- 0.84, N = 7)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 147.28   b: 141.73   c: 136.97   (SE +/- 1.47, N = 5)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 47.72   b: 47.84   c: 48.16   (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 72.00   b: 74.08   c: 73.87   (SE +/- 0.70, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 97.62   b: 98.93   c: 97.18   (SE +/- 1.24, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 141.60   b: 141.33   c: 133.76   (SE +/- 1.23, N = 15)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 94.55   b: 92.09   c: 89.99   (SE +/- 0.57, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 151.71   b: 149.06   c: 154.18   (SE +/- 2.71, N = 15)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 202.95   b: 204.44   c: 209.51   (SE +/- 2.54, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4154 - Tokens Per Second, More Is Better
a: 286.32   b: 297.04   c: 290.80   (SE +/- 2.33, N = 10)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Renaissance

Test: Scala Dotty

Renaissance 0.16 - ms, Fewer Is Better
a: 723.7 (MIN: 568.06 / MAX: 947.61)
b: 694.0 (MIN: 546.15 / MAX: 1018.83)
c: 642.6 (MIN: 577.67 / MAX: 838.49)
SE +/- 11.19, N = 15
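
The Renaissance numbers come from the suite's own JVM harness, which reports per-repetition wall-clock times in milliseconds. A minimal sketch of invoking a single benchmark outside PTS is shown below; the jar file name and repetition count are assumptions, and -r is the harness's repetition flag as documented for recent Renaissance releases.

```python
import subprocess

# Assumed file name for the Renaissance 0.16 distribution jar
jar = "renaissance-gpl-0.16.0.jar"

# Run the Scala Dotty benchmark for 15 repetitions on the system JVM;
# the harness prints the duration of each repetition in milliseconds.
subprocess.run(["java", "-jar", jar, "-r", "15", "scala-dotty"], check=True)
```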

Renaissance

Test: Random Forest

Renaissance 0.16 - ms, Fewer Is Better
a: 672.1 (MIN: 528.68 / MAX: 784.21)
b: 705.0 (MIN: 549.24 / MAX: 883.64)
c: 687.2 (MIN: 619.72 / MAX: 816.51)
SE +/- 10.59, N = 15

Renaissance

Test: ALS Movie Lens

Renaissance 0.16 - ms, Fewer Is Better
a: 21012.4 (MIN: 20607.74 / MAX: 21012.43)
b: 20956.9 (MIN: 20415.87 / MAX: 20987.27)
c: 21047.6 (MIN: 20587.08 / MAX: 21047.64)
SE +/- 23.68, N = 3

Renaissance

Test: Apache Spark Bayes

Renaissance 0.16 - ms, Fewer Is Better
a: 299.2 (MIN: 265.15 / MAX: 417.4)
b: 310.6 (MIN: 247.11 / MAX: 495.6)
c: 332.0 (MIN: 276.08 / MAX: 457.42)
SE +/- 2.61, N = 15

Renaissance

Test: Savina Reactors.IO

Renaissance 0.16 - ms, Fewer Is Better
a: 9997.9 (MIN: 9507.2 / MAX: 10412.82)
b: 10225.3 (MIN: 8218.9 / MAX: 10824.18)
c: 10675.6 (MIN: 9337.39 / MAX: 11311.47)
SE +/- 142.48, N = 3

Renaissance

Test: Apache Spark PageRank

Renaissance 0.16 - ms, Fewer Is Better
a: 2299.3 (MIN: 2027.84)
b: 2315.4 (MIN: 2015.69 / MAX: 2330.45)
c: 2294.9 (MIN: 1998.18 / MAX: 2294.94)
SE +/- 13.05, N = 3

Renaissance

Test: Finagle HTTP Requests

Renaissance 0.16 - ms, Fewer Is Better
a: 2859.7 (MIN: 2819.14 / MAX: 3047.95)
b: 2947.6 (MIN: 2887.93 / MAX: 3135.87)
c: 2933.9 (MIN: 2923.69 / MAX: 3084.54)
SE +/- 34.69, N = 3

Renaissance

Test: Gaussian Mixture Model

Renaissance 0.16 - ms, Fewer Is Better
a: 5479.3 (MIN: 5453.52 / MAX: 5882.54)
b: 5433.1 (MIN: 5369.88 / MAX: 5913.36)
c: 5503.2 (MIN: 5397.36 / MAX: 6278.06)
SE +/- 16.67, N = 3

Renaissance

Test: In-Memory Database Shootout

Renaissance 0.16 - ms, Fewer Is Better
a: 10422.6 (MIN: 9582.91 / MAX: 11772.47)
b: 9314.3 (MIN: 8410.14 / MAX: 11497.43)
c: 10030.8 (MIN: 9132.4 / MAX: 10593.67)
SE +/- 56.20, N = 3

Renaissance

Test: Akka Unbalanced Cobwebbed Tree

Renaissance 0.16 - ms, Fewer Is Better
a: 17556.2 (MIN: 17133.42 / MAX: 17556.24)
b: 17357.9 (MIN: 16886.01 / MAX: 17436.54)
c: 17006.5 (MIN: 16726.11 / MAX: 17006.52)
SE +/- 47.28, N = 3

Renaissance

Test: Genetic Algorithm Using Jenetics + Futures

Renaissance 0.16 - ms, Fewer Is Better
a: 1417.2 (MIN: 1400.59 / MAX: 1466.56)
b: 1403.0 (MIN: 1354.45 / MAX: 1433.65)
c: 1391.1 (MIN: 1342.54 / MAX: 1457.18)
SE +/- 2.41, N = 3

Primesieve

Length: 1e12

Primesieve 12.6 - Seconds, Fewer Is Better
a: 1.912   b: 1.924   c: 1.910   (SE +/- 0.006, N = 3)
1. (CXX) g++ options: -O3

Primesieve

Length: 1e13

Primesieve 12.6 - Seconds, Fewer Is Better
a: 23.71   b: 23.70   c: 23.69   (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3
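
Primesieve here is the reference C++ sieve timed end to end by the test profile. As a rough cross-check of the same prime-counting workload, the sketch below assumes the python-primesieve bindings and their count_primes function (an assumption about the bindings, not part of this test profile), so the timing will differ from the native binary.

```python
import time
import primesieve  # assumes the python-primesieve bindings are installed

start = time.time()
count = primesieve.count_primes(10**12)  # count all primes below 1e12
print(f"{count} primes below 1e12 in {time.time() - start:.3f} s")
```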

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.3 - Seconds, Fewer Is Better
a: 13.67   b: 13.59   c: 13.68   (SE +/- 0.05, N = 3)

Blender

Blend File: Junkshop - Compute: CPU-Only

Blender 4.3 - Seconds, Fewer Is Better
a: 17.50   b: 17.36   c: 17.36   (SE +/- 0.02, N = 3)

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.3 - Seconds, Fewer Is Better
a: 35.91   b: 35.80   c: 35.73   (SE +/- 0.03, N = 3)

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.3 - Seconds, Fewer Is Better
a: 19.08   b: 19.14   c: 19.00   (SE +/- 0.09, N = 3)

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.3 - Seconds, Fewer Is Better
a: 125.69   b: 125.86   c: 125.72   (SE +/- 0.17, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.3 - Seconds, Fewer Is Better
a: 42.22   b: 42.12   c: 42.24   (SE +/- 0.17, N = 3)
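
The Blender results are wall-clock times for rendering the stock demo scenes on the CPU only. A minimal sketch of timing a comparable headless render is shown below; the .blend file path is an assumption, while -b (background), -E (render engine), and -f (render frame) are standard Blender command-line flags. Device selection ultimately follows the scene and preferences, so a CPU-only run may need Cycles configured accordingly.

```python
import subprocess
import time

# Hypothetical local path to the BMW27 demo scene
blend_file = "bmw27_cpu.blend"

start = time.time()
# Render frame 1 headlessly with the Cycles engine
subprocess.run(["blender", "-b", blend_file, "-E", "CYCLES", "-f", "1"], check=True)
print(f"Render took {time.time() - start:.2f} s")
```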


Phoronix Test Suite v10.8.5