general

2 x AMD EPYC 9274F 24-Core testing with an ASUS ESC8000A-E12 K14PG-D24 (1201 BIOS) and an ASUS NVIDIA H100 NVL 94GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2501312-NE-GENERAL1651&grw.
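
The result identifier in that URL can typically be handed straight back to the Phoronix Test Suite to repeat the same test selection locally and compare against these numbers. A minimal sketch, assuming the phoronix-test-suite CLI is installed and that this result ID is accepted by its benchmark subcommand:

    import subprocess

    # Re-run the tests behind this result and compare against the uploaded numbers.
    # Assumes phoronix-test-suite is installed and on PATH; the result ID is taken
    # from the URL above.
    result_id = "2501312-NE-GENERAL1651"
    subprocess.run(["phoronix-test-suite", "benchmark", result_id], check=True)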

System: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core

Processor: 2 x AMD EPYC 9274F 24-Core @ 4.05GHz (48 Cores / 96 Threads)
Motherboard: ASUS ESC8000A-E12 K14PG-D24 (1201 BIOS)
Chipset: AMD Device 14a4
Memory: 1136GB
Disk: 240GB MR9540-8i
Graphics: ASUS NVIDIA H100 NVL 94GB
Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb
OS: Ubuntu 22.04
Kernel: 6.8.0-51-generic (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.7.33
Vulkan: 1.3.289
Compiler: GCC 11.4.0 + CUDA 12.6
File-System: ext4
Screen Resolution: 1920x1200

OpenBenchmarking.org notes:
- Transparent Huge Pages: madvise
- Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Disk scheduler / mount options: MQ-DEADLINE / relatime,rw,stripe=16 / Block Size: 4096
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa101148
- Python 3.10.12
- Security mitigations: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result overview (flattened summary table in the HTML export): per-test results for compilebench, stress-ng, epoch, cython-bench, glibc-bench, hpl, litert, mrbayes, rbenchmark, numpy, ai-benchmark, pytorch, spacy, tensorflow-lite, llama-cpp, llamafile, ncnn, npb, onednn, intel-mpi, build-llvm, build-gcc, build-linux-kernel, and pybench on Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core. The individual values are broken out in the detailed sections that follow.

Compile Bench

Test: Compile

OpenBenchmarking.org - MB/s, More Is Better - Compile Bench 0.6, Test: Compile - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 1613.38 (SE +/- 11.78, N = 3)
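
Each bar above and in the sections below reports the mean of N runs together with the standard error noted as "SE +/- x, N = y". A minimal sketch of how that figure relates to per-run samples (the three throughput values below are hypothetical, chosen only so that the mean and SE come out to the 1613.38 MB/s and +/- 11.78 reported above; the raw per-run samples are not part of this export):

    import statistics

    # Hypothetical per-run throughputs (MB/s); only the mean and SE are known from the report.
    runs = [1592.98, 1613.38, 1633.78]

    mean = statistics.mean(runs)                     # value plotted for the bar
    sem = statistics.stdev(runs) / len(runs) ** 0.5  # standard error of the mean
    print(f"mean = {mean:.2f} MB/s, SE +/- {sem:.2f}, N = {len(runs)}")
    # -> mean = 1613.38 MB/s, SE +/- 11.78, N = 3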

Compile Bench

Test: Initial Create

OpenBenchmarking.org - MB/s, More Is Better - Compile Bench 0.6, Test: Initial Create - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 399.79 (SE +/- 4.73, N = 3)

Compile Bench

Test: Read Compiled Tree

OpenBenchmarking.org - MB/s, More Is Better - Compile Bench 0.6, Test: Read Compiled Tree - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 2657.42 (SE +/- 6.72, N = 3)

Stress-NG

Test: Hash

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Hash - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 12298718.43 (SE +/- 46080.65, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: MMAP

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: MMAP - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 12391.70 (SE +/- 89.55, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: NUMA

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: NUMA - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 400.80 (SE +/- 2.65, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Pipe

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Pipe - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 21174901.61 (SE +/- 278876.82, N = 12) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Poll

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Poll - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5524301.71 (SE +/- 2718.46, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Zlib

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Zlib - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6888.22 (SE +/- 2.93, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Futex

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Futex - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 2352726.22 (SE +/- 20321.47, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: MEMFD

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: MEMFD - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3377.15 (SE +/- 3.40, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Mutex

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Mutex - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 14723637.91 (SE +/- 89617.88, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Atomic

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Atomic - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 257.02 (SE +/- 0.39, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Crypto

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Crypto - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 402622178.28 (SE +/- 64894044.34, N = 15) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Malloc

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Malloc - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 106614259.32 (SE +/- 781395.83, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Cloning

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Cloning - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8039.79 (SE +/- 68.82, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Forking

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Forking - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 62039.40 (SE +/- 359.02, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Pthread

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Pthread - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 135182.40 (SE +/- 113.35, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: AVL Tree

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: AVL Tree - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 1194.06 (SE +/- 1.00, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: IO_uring

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: IO_uring - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 174744.18 (SE +/- 4811.73, N = 12) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: SENDFILE

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: SENDFILE - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 916359.11 (SE +/- 380.91, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: CPU Cache

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: CPU Cache - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 771976.41 (SE +/- 2865.93, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: CPU Stress

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: CPU Stress - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 145536.61 (SE +/- 271.68, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Power Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Power Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 119629.76 (SE +/- 87.83, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Semaphores

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Semaphores - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 61709090.58 (SE +/- 563816.56, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Epoch

Epoch3D Deck: Cone

OpenBenchmarking.org - Seconds, Fewer Is Better - Epoch 4.19.4, Epoch3D Deck: Cone - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 384.19 (SE +/- 4.25, N = 3) - 1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Stress-NG

Test: Matrix Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Matrix Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 310400.31 (SE +/- 26.15, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Vector Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Vector Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 407859.79 (SE +/- 2733.46, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: AVX-512 VNNI

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: AVX-512 VNNI - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6540887.21 (SE +/- 25442.23, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Integer Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Integer Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4586743.65 (SE +/- 19349.76, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Function Call

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Function Call - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 45580.45 (SE +/- 71.73, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: x86_64 RdRand

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: x86_64 RdRand - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 20427852.82 (SE +/- 68905.39, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Floating Point

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Floating Point - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 20022.51 (SE +/- 76.48, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Matrix 3D Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Matrix 3D Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16067.53 (SE +/- 58.20, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Memory Copying

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Memory Copying - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 13628.14 (SE +/- 18.97, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Vector Shuffle

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Vector Shuffle - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 42786.90 (SE +/- 211.62, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Mixed Scheduler

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Mixed Scheduler - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 38187.39 (SE +/- 172.62, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Socket Activity

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Socket Activity - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 20499.66 (SE +/- 5.10, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Exponential Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Exponential Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 198051.5 (SE +/- 1213.20, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Jpeg Compression

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Jpeg Compression - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 66045.55 (SE +/- 141.87, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Logarithmic Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Logarithmic Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 374205.22 (SE +/- 588.05, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Wide Vector Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Wide Vector Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 2734553.44 (SE +/- 706.91, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Context Switching

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Context Switching - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 15485954.92 (SE +/- 27120.62, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Fractal Generator

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Fractal Generator - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 393.67 (SE +/- 0.18, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Radix String Sort

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Radix String Sort - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 2158.48 (SE +/- 14.11, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Fused Multiply-Add

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Fused Multiply-Add - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 47688547.66 (SE +/- 16313.26, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Trigonometric Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Trigonometric Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 151938.42 (SE +/- 47.69, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Bitonic Integer Sort

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Bitonic Integer Sort - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 672.51 (SE +/- 0.35, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Vector Floating Point

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Vector Floating Point - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 182971.03 (SE +/- 489.30, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Bessel Math Operations

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Bessel Math Operations - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 36898.14 (SE +/- 2.68, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Integer Bit Operations

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Integer Bit Operations - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 9623549.83 (SE +/- 1390.58, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Glibc C String Functions

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Glibc C String Functions - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 56864254.14 (SE +/- 270620.44, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Glibc Qsort Data Sorting

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Glibc Qsort Data Sorting - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 1590.91 (SE +/- 0.21, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: System V Message Passing

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: System V Message Passing - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 14211326.19 (SE +/- 27048.08, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: POSIX Regular Expressions

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: POSIX Regular Expressions - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 444669.60 (SE +/- 185.23, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Stress-NG

Test: Hyperbolic Trigonometric Math

OpenBenchmarking.org - Bogo Ops/s, More Is Better - Stress-NG 0.18.09, Test: Hyperbolic Trigonometric Math - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 293490.26 (SE +/- 78.25, N = 3) - 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm

Cython Benchmark

Test: N-Queens

OpenBenchmarking.org - Seconds, Fewer Is Better - Cython Benchmark 0.29.21, Test: N-Queens - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 17.98 (SE +/- 0.14, N = 3)

Glibc Benchmarks

Benchmark: cos

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: cos - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 70.82 (SE +/- 0.01, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: exp

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: exp - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 15.14 (SE +/- 0.00, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: ffs

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: ffs - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5.67032 (SE +/- 0.00021, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: pow

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: pow - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 35.56 (SE +/- 0.13, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: sin

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: sin - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 63.12 (SE +/- 0.00, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: log2

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: log2 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 10.42 (SE +/- 0.00, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: modf

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: modf - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.61957 (SE +/- 0.00034, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: sinh

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: sinh - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 22.96 (SE +/- 0.02, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: sqrt

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: sqrt - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8.24574 (SE +/- 0.00536, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: tanh

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: tanh - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 26.97 (SE +/- 0.00, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: asinh

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: asinh - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 22.58 (SE +/- 0.00, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: atanh

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: atanh - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 28.18 (SE +/- 0.00, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: ffsll

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: ffsll - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5.67055 (SE +/- 0.00011, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: sincos

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: sincos - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 38.98 (SE +/- 0.01, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

Glibc Benchmarks

Benchmark: pthread_once

OpenBenchmarking.org - ns, Fewer Is Better - Glibc Benchmarks 2.39, Benchmark: pthread_once - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5.65547 (SE +/- 0.00045, N = 3) - 1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s

HPL Linpack

OpenBenchmarking.org - GFLOPS, More Is Better - HPL Linpack 2.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 631.30 (SE +/- 0.39, N = 3) - 1. (CC) gcc options: -O2 -lopenblas -lm -lmpi

LiteRT

Model: DeepLab V3

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: DeepLab V3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8793.87 (SE +/- 57.34, N = 15)

LiteRT

Model: SqueezeNet

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: SqueezeNet - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5245.27 (SE +/- 33.72, N = 3)

LiteRT

Model: Inception V4

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: Inception V4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 38062.4 (SE +/- 455.18, N = 3)

LiteRT

Model: NASNet Mobile

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: NASNet Mobile - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 75527.3 (SE +/- 699.04, N = 7)

LiteRT

Model: Mobilenet Float

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: Mobilenet Float - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3345.85 (SE +/- 39.32, N = 3)

LiteRT

Model: Mobilenet Quant

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: Mobilenet Quant - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3255.76 (SE +/- 51.59, N = 15)

LiteRT

Model: Inception ResNet V2

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: Inception ResNet V2 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 45507.6 (SE +/- 312.59, N = 15)

LiteRT

Model: Quantized COCO SSD MobileNet v1

OpenBenchmarking.org - Microseconds, Fewer Is Better - LiteRT 2024-10-15, Model: Quantized COCO SSD MobileNet v1 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5503.91 (SE +/- 71.29, N = 13)

Timed MrBayes Analysis

Primate Phylogeny Analysis

OpenBenchmarking.org - Seconds, Fewer Is Better - Timed MrBayes Analysis 3.2.7, Primate Phylogeny Analysis - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 81.58 (SE +/- 0.77, N = 3) - 1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm -lreadline

R Benchmark

OpenBenchmarking.org - Seconds, Fewer Is Better - R Benchmark - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 0.1446 (SE +/- 0.0014, N = 3) - 1. R scripting front-end version 4.1.2 (2021-11-01)

Numpy Benchmark

OpenBenchmarking.org - Score, More Is Better - Numpy Benchmark - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 565.16 (SE +/- 0.45, N = 3)

AI Benchmark Alpha

Device Inference Score

OpenBenchmarking.org - Score, More Is Better - AI Benchmark Alpha 0.1.2, Device Inference Score - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3196

AI Benchmark Alpha

Device Training Score

OpenBenchmarking.org - Score, More Is Better - AI Benchmark Alpha 0.1.2, Device Training Score - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3380

AI Benchmark Alpha

Device AI Score

OpenBenchmarking.org - Score, More Is Better - AI Benchmark Alpha 0.1.2, Device AI Score - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6576

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 1 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 34.93 (SE +/- 0.39, N = 15) - MIN: 13.13 / MAX: 41.07

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 1 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 13.97 (SE +/- 0.11, N = 3) - MIN: 8.7 / MAX: 15.19

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 16 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 29.77 (SE +/- 0.40, N = 15) - MIN: 13.43 / MAX: 34.1

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 32 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 29.03 (SE +/- 0.21, N = 3) - MIN: 19.55 / MAX: 32.27

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 64 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 29.06 (SE +/- 0.18, N = 3) - MIN: 16.5 / MAX: 32.6

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 16 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.27 (SE +/- 0.11, N = 12) - MIN: 6.86 / MAX: 12.45

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 256 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 29.01 (SE +/- 0.32, N = 5) - MIN: 16.97 / MAX: 32.29

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 32 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 12.29 (SE +/- 0.13, N = 3) - MIN: 7.38 / MAX: 12.68

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 512 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 30.05 (SE +/- 0.44, N = 12) - MIN: 17.64 / MAX: 35.18

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 64 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.51 (SE +/- 0.12, N = 3) - MIN: 7.49 / MAX: 12.29

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 256 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.45 (SE +/- 0.03, N = 3) - MIN: 10.75 / MAX: 12.11

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 512 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.61 (SE +/- 0.09, N = 12) - MIN: 6.77 / MAX: 12.65

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8.92 (SE +/- 0.13, N = 3) - MIN: 5.44 / MAX: 9.23

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5.99 (SE +/- 0.05, N = 3) - MIN: 4.56 / MAX: 6.36

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.05 (SE +/- 0.04, N = 3) - MIN: 4.81 / MAX: 6.36

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.18 (SE +/- 0.06, N = 3) - MIN: 4.86 / MAX: 6.54

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.07 (SE +/- 0.06, N = 6) - MIN: 3.72 / MAX: 6.69

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.09 (SE +/- 0.06, N = 3) - MIN: 1.78 / MAX: 6.47

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 287.46 (SE +/- 3.16, N = 3) - MIN: 131.87 / MAX: 296.46

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 101.94 (SE +/- 1.00, N = 3) - MIN: 72.8 / MAX: 104.88

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 279.17 (SE +/- 2.06, N = 3) - MIN: 168.32 / MAX: 287.01

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 282.75 (SE +/- 2.42, N = 3) - MIN: 166.76 / MAX: 292.39

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 279.19 (SE +/- 1.76, N = 3) - MIN: 167.62 / MAX: 285.62

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 102.11 (SE +/- 0.30, N = 3) - MIN: 74.46 / MAX: 105.04

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 282.30 (SE +/- 3.61, N = 3) - MIN: 166.48 / MAX: 290.65

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 102.43 (SE +/- 1.05, N = 3) - MIN: 75.28 / MAX: 104.92

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 283.02 (SE +/- 3.04, N = 3) - MIN: 168.78 / MAX: 291.41

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 102.21 (SE +/- 0.84, N = 3) - MIN: 73.94 / MAX: 104.89

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 102.81 (SE +/- 0.62, N = 3) - MIN: -2.4 / MAX: 105.21

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 102.98 (SE +/- 0.47, N = 3) - MIN: 73.53 / MAX: 107.1

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 51.52 (SE +/- 0.62, N = 4) - MIN: 36.53 / MAX: 55.15

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 47.32 (SE +/- 0.24, N = 3) - MIN: 36.64 / MAX: 48.53

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 46.77 (SE +/- 0.61, N = 3) - MIN: 36.59 / MAX: 48.4

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 47.80 (SE +/- 0.44, N = 3) - MIN: 2.23 / MAX: 49.81

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 47.34 (SE +/- 0.50, N = 3) - MIN: 39.26 / MAX: 50.11

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l

OpenBenchmarking.org - batches/sec, More Is Better - PyTorch 2.2.1, Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 46.72 (SE +/- 0.07, N = 3) - MIN: 39.24 / MAX: 48.63

spaCy

Model: en_core_web_lg

OpenBenchmarking.org - tokens/sec, More Is Better - spaCy 3.4.1, Model: en_core_web_lg - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 15338 (SE +/- 33.20, N = 3)

spaCy

Model: en_core_web_trf

OpenBenchmarking.org - tokens/sec, More Is Better - spaCy 3.4.1, Model: en_core_web_trf - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 2751 (SE +/- 96.13, N = 3)

TensorFlow Lite

Model: SqueezeNet

OpenBenchmarking.org - Microseconds, Fewer Is Better - TensorFlow Lite 2022-05-18, Model: SqueezeNet - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5088.14 (SE +/- 29.41, N = 3)

TensorFlow Lite

Model: Inception V4

OpenBenchmarking.org - Microseconds, Fewer Is Better - TensorFlow Lite 2022-05-18, Model: Inception V4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 37181.9 (SE +/- 358.95, N = 3)

TensorFlow Lite

Model: NASNet Mobile

OpenBenchmarking.org - Microseconds, Fewer Is Better - TensorFlow Lite 2022-05-18, Model: NASNet Mobile - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 69069.7 (SE +/- 562.50, N = 3)

TensorFlow Lite

Model: Mobilenet Float

OpenBenchmarking.org - Microseconds, Fewer Is Better - TensorFlow Lite 2022-05-18, Model: Mobilenet Float - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3350.72 (SE +/- 26.31, N = 15)

TensorFlow Lite

Model: Mobilenet Quant

OpenBenchmarking.org - Microseconds, Fewer Is Better - TensorFlow Lite 2022-05-18, Model: Mobilenet Quant - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4098.48 (SE +/- 12.05, N = 3)

TensorFlow Lite

Model: Inception ResNet V2

OpenBenchmarking.org - Microseconds, Fewer Is Better - TensorFlow Lite 2022-05-18, Model: Inception ResNet V2 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 68557.9 (SE +/- 1796.42, N = 15)

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 17.56 (SE +/- 0.02, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 43.87 (SE +/- 0.40, N = 15) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 43.85 (SE +/- 0.75, N = 12) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 45.57 (SE +/- 0.99, N = 7) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 147.63 (SE +/- 0.09, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 18.58 (SE +/- 0.03, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8352.57 (SE +/- 0.80, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 13019.73 (SE +/- 5.36, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 18283.75 (SE +/- 0.46, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 43.54 (SE +/- 0.69, N = 12) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 42.88 (SE +/- 0.38, N = 12) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 44.00 (SE +/- 0.84, N = 9) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 151.83 (SE +/- 0.10, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 63.08 (SE +/- 0.35, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8390.83 (SE +/- 0.48, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 13052.60 (SE +/- 1.13, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 18308.61 (SE +/- 1.84, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 193.81 (SE +/- 6.45, N = 12) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 212.92 (SE +/- 2.68, N = 15) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 211.93 (SE +/- 2.80, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 84.98 (SE +/- 0.01, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3144.42 (SE +/- 0.28, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3150.78 (SE +/- 0.63, N = 3) - 1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397, Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3121.14 (SE +/- 0.10, N = 3) - 1. (CXX) g++ options: -O3

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 41.91 (SE +/- 0.21, N = 3)

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 42.16 (SE +/- 0.30, N = 3)

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4096 (SE +/- 0.00, N = 3)

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8192 (SE +/- 0.00, N = 3)

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 57.78 (SE +/- 0.61, N = 4)

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16384 (SE +/- 0.00, N = 3)

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 32768 (SE +/- 0.00, N = 3)

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 56.82 (SE +/- 0.59, N = 3)

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llamafile 0.8.16, Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 24.04 (SE +/- 0.09, N = 3)

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4096 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8192 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 24.12 Tokens Per Second (More Is Better; SE +/- 0.04, N = 3)

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4.92 Tokens Per Second (More Is Better; SE +/- 0.03, N = 3)

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16384 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 32768 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 5.00 Tokens Per Second (More Is Better; SE +/- 0.02, N = 3)

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 256

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4096 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 512

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8192 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 1024

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16384 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 2048

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 32768 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 256

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 1536 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 512

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3072 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 1024

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6144 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 2048

Llamafile 0.8.16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 12288 Tokens Per Second (More Is Better; SE +/- 0.00, N = 3)
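
The text-generation figures above are throughputs; their reciprocal gives an approximate per-token latency. A small arithmetic sketch using the Llama-3.2-3B Text Generation 16 result (the conversion is ours, not something Llamafile reports):

# Convert a text-generation throughput into an approximate per-token latency.
# 41.91 is the Llama-3.2-3B-Instruct.Q6_K "Text Generation 16" result above.
tokens_per_second = 41.91
ms_per_token = 1000.0 / tokens_per_second
print(f"{tokens_per_second} tok/s ~= {ms_per_token:.1f} ms per generated token")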

NCNN

Target: CPU - Model: mobilenet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 23.31 ms (Fewer Is Better; SE +/- 0.42, N = 12; MIN: 20.72 / MAX: 49.06). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.85 ms (Fewer Is Better; SE +/- 0.07, N = 12; MIN: 11.19 / MAX: 16.92). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 14.36 ms (Fewer Is Better; SE +/- 1.23, N = 12; MIN: 12.51 / MAX: 1452). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16.76 ms (Fewer Is Better; SE +/- 0.05, N = 12; MIN: 16.18 / MAX: 40.16). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.52 ms (Fewer Is Better; SE +/- 0.07, N = 12; MIN: 10.69 / MAX: 16.83). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16.71 ms (Fewer Is Better; SE +/- 0.14, N = 12; MIN: 14.94 / MAX: 30.14). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.39 ms (Fewer Is Better; SE +/- 0.05, N = 12; MIN: 5.9 / MAX: 7.26). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 22.65 ms (Fewer Is Better; SE +/- 0.26, N = 12; MIN: 20.21 / MAX: 53.24). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 42.12 ms (Fewer Is Better; SE +/- 0.44, N = 12; MIN: 38.09 / MAX: 61.81). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 12.69 ms (Fewer Is Better; SE +/- 0.15, N = 12; MIN: 11.43 / MAX: 28.06). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.32 ms (Fewer Is Better; SE +/- 0.07, N = 12; MIN: 5.67 / MAX: 19.42). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 22.82 ms (Fewer Is Better; SE +/- 0.25, N = 12; MIN: 21.27 / MAX: 32.64). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 23.31 ms (Fewer Is Better; SE +/- 0.42, N = 12; MIN: 20.72 / MAX: 49.06). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 31.75 ms (Fewer Is Better; SE +/- 0.47, N = 12; MIN: 28.18 / MAX: 38.62). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 23.98 ms (Fewer Is Better; SE +/- 0.32, N = 12; MIN: 21.56 / MAX: 65.28). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 45.73 ms (Fewer Is Better; SE +/- 0.27, N = 12; MIN: -425.63 / MAX: 85.35). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 57.21 ms (Fewer Is Better; SE +/- 0.39, N = 12; MIN: 52 / MAX: 568.96). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 17.79 ms (Fewer Is Better; SE +/- 0.32, N = 12; MIN: 16.23 / MAX: 24.59). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 23.15 ms (Fewer Is Better; SE +/- 0.17, N = 3; MIN: 22.82 / MAX: 28.93). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.77 ms (Fewer Is Better; SE +/- 0.40, N = 3; MIN: 11.14 / MAX: 13.15). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 13.33 ms (Fewer Is Better; SE +/- 0.23, N = 3; MIN: 12.65 / MAX: 18.55). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16.56 ms (Fewer Is Better; SE +/- 0.09, N = 3; MIN: 16.28 / MAX: 17.55). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 11.69 ms (Fewer Is Better; SE +/- 0.22, N = 3; MIN: 11.09 / MAX: 19.42). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 16.99 ms (Fewer Is Better; SE +/- 0.15, N = 3; MIN: 16.56 / MAX: 22.2). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.35 ms (Fewer Is Better; SE +/- 0.05, N = 3; MIN: 6.16 / MAX: 11.26). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 22.25 ms (Fewer Is Better; SE +/- 0.24, N = 3; MIN: 21.82 / MAX: 28.73). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 41.41 ms (Fewer Is Better; SE +/- 1.07, N = 3; MIN: 39.19 / MAX: 48.36). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 12.46 ms (Fewer Is Better; SE +/- 0.18, N = 3; MIN: 11.95 / MAX: 13.13). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6.58 ms (Fewer Is Better; SE +/- 0.16, N = 3; MIN: 6.19 / MAX: 11.02). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 22.63 ms (Fewer Is Better; SE +/- 0.47, N = 3; MIN: 21.57 / MAX: 30.39). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 23.15 ms (Fewer Is Better; SE +/- 0.17, N = 3; MIN: 22.82 / MAX: 28.93). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 31.30 ms (Fewer Is Better; SE +/- 0.69, N = 3; MIN: 29.76 / MAX: 37.03). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 23.72 ms (Fewer Is Better; SE +/- 0.30, N = 3; MIN: 23.06 / MAX: 29.24). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 46.67 ms (Fewer Is Better; SE +/- 0.91, N = 3; MIN: 45.2 / MAX: 88.32). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 56.65 ms (Fewer Is Better; SE +/- 0.48, N = 3; MIN: 54.43 / MAX: 62.81). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20241226 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 17.29 ms (Fewer Is Better; SE +/- 0.64, N = 3; MIN: 15.96 / MAX: 23.34). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
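
NCNN reports a mean latency in milliseconds per forward pass. Read as a single-stream rate, the reciprocal gives inferences per second; a short sketch using the CPU mobilenet mean above (an informal conversion, not an NCNN output):

# Rough conversion of NCNN's per-inference latency (ms, fewer is better)
# into single-stream throughput. 23.31 ms is the CPU mobilenet mean above.
latency_ms = 23.31
throughput = 1000.0 / latency_ms
print(f"{latency_ms} ms/inference ~= {throughput:.1f} inferences/second (single stream)")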

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 176096.16 Total Mop/s (More Is Better; SE +/- 248.09, N = 3). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 50419.73 Total Mop/s (More Is Better; SE +/- 396.73, N = 15). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: EP.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6692.95 Total Mop/s (More Is Better; SE +/- 96.53, N = 15). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: EP.D

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 7273.96 Total Mop/s (More Is Better; SE +/- 66.24, N = 7). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: FT.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 102149.25 Total Mop/s (More Is Better; SE +/- 426.91, N = 3). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 4064.08 Total Mop/s (More Is Better; SE +/- 15.41, N = 3). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 199460.27 Total Mop/s (More Is Better; SE +/- 202.80, N = 3). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: MG.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 119325.63 Total Mop/s (More Is Better; SE +/- 947.46, N = 3). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: SP.B

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 146367.34 Total Mop/s (More Is Better; SE +/- 1283.80, N = 8). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: SP.C

NAS Parallel Benchmarks 3.4 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 108596.64 Total Mop/s (More Is Better; SE +/- 153.92, N = 3). 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
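
Total Mop/s is millions of operations per second, so a result can be turned back into an implied operation count once a runtime is assumed. A sketch combining the BT.C figure above with a purely hypothetical wall-clock time:

# Relate a Total Mop/s result to an operation count; the elapsed time is a
# made-up placeholder, not something contained in this result file.
mop_per_s = 176096.16        # reported Total Mop/s for BT.C above
elapsed_seconds = 60.0       # hypothetical runtime
total_operations = mop_per_s * 1e6 * elapsed_seconds
print(f"~{total_operations:.3e} operations in {elapsed_seconds:.0f} s at {mop_per_s:,.2f} Mop/s")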

oneDNN

Harness: IP Shapes 1D - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 0.827742 ms (Fewer Is Better; SE +/- 0.005050, N = 3; MIN: 0.77). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: IP Shapes 3D - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 0.310889 ms (Fewer Is Better; SE +/- 0.002020, N = 3; MIN: 0.28). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 0.673369 ms (Fewer Is Better; SE +/- 0.001462, N = 3; MIN: 0.64). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 8.66502 ms (Fewer Is Better; SE +/- 0.03403, N = 3; MIN: 7.33). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 1.10328 ms (Fewer Is Better; SE +/- 0.00100, N = 3; MIN: 1.05). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 669.68 ms (Fewer Is Better; SE +/- 1.20, N = 3; MIN: 653.95). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

oneDNN 3.6 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 398.49 ms (Fewer Is Better; SE +/- 0.88, N = 3; MIN: 390.22). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

Intel MPI Benchmarks

Test: IMB-P2P PingPong

Intel MPI Benchmarks 2019.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 28744589 Average Msg/sec (More Is Better; SE +/- 68029.72, N = 3; MIN: 15552 / MAX: 71388566). 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 6073.65 Average Mbytes/sec (More Is Better; SE +/- 18.92, N = 3; MAX: 26527.08). 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 108.05 Average usec (Fewer Is Better; SE +/- 3.64, N = 3; MIN: 1.13 / MAX: 1949.29). 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 PingPong

Intel MPI Benchmarks 2019.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3739.53 Average Mbytes/sec (More Is Better; SE +/- 61.28, N = 15; MIN: 3.8 / MAX: 14414.3). 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 3393.59 Average Mbytes/sec (More Is Better; SE +/- 22.79, N = 3; MAX: 13584.14). 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 79.47 Average usec (Fewer Is Better; SE +/- 0.66, N = 3; MIN: 0.64 / MAX: 1174.19). 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
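
The Exchange and Sendrecv tests are reported twice, once as Average Mbytes/sec and once as Average usec, because bandwidth and latency are two views of the same transfers. For a single message size the relationship is simple division; the sketch below uses hypothetical values, since the IMB averages span many message sizes:

# Bandwidth implied by a per-iteration time for one fixed message size.
# Both values here are illustrative placeholders, not taken from the results.
message_bytes = 4 * 1024 * 1024    # hypothetical 4 MiB message
elapsed_usec = 1200.0              # hypothetical time for one iteration
mbytes_per_sec = (message_bytes / 1e6) / (elapsed_usec / 1e6)
print(f"{mbytes_per_sec:.1f} MB/s implied by {elapsed_usec:.0f} usec per iteration")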

Timed LLVM Compilation

Build System: Ninja

Timed LLVM Compilation 16.0 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 135.29 Seconds (Fewer Is Better; SE +/- 1.19, N = 3)

Timed LLVM Compilation

Build System: Unix Makefiles

Timed LLVM Compilation 16.0 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 198.90 Seconds (Fewer Is Better; SE +/- 1.10, N = 3)

Timed GCC Compilation

Time To Compile

Timed GCC Compilation 13.2 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 913.62 Seconds (Fewer Is Better; SE +/- 0.79, N = 3)

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.8 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 30.30 Seconds (Fewer Is Better; SE +/- 0.22, N = 14)

Timed Linux Kernel Compilation

Build: allmodconfig

Timed Linux Kernel Compilation 6.8 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 258.64 Seconds (Fewer Is Better; SE +/- 0.84, N = 3)
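
The compile benchmarks above are plain wall-clock measurements of a full build. A minimal timing sketch of that idea; the command is a placeholder, and the Phoronix Test Suite drives the real LLVM, GCC, and kernel builds with its own harness:

import subprocess
import time

# Time a build command end to end, "Seconds, Fewer Is Better" style.
# Assumes an already-configured source tree in the current directory.
cmd = ["make", "-j", "96"]

start = time.perf_counter()
subprocess.run(cmd, check=False)
elapsed = time.perf_counter() - start
print(f"Build finished in {elapsed:.2f} seconds")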

PyBench

Total For Average Test Times

PyBench 2018-02-16 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core: 738 Milliseconds (Fewer Is Better; SE +/- 3.06, N = 3)
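
PyBench totals the average per-test times of its micro-benchmarks in milliseconds. A rough standard-library analogue, timing an arbitrary statement and reporting the per-call average in milliseconds (illustrative only, not the PyBench harness):

import timeit

# Best-of-N-rounds average per call for one arbitrary statement, in ms.
rounds = 10
calls_per_round = 1000
best_round_s = min(timeit.repeat("sorted(range(1000), reverse=True)",
                                 repeat=rounds, number=calls_per_round))
print(f"{best_round_s / calls_per_round * 1000:.3f} ms per call (best of {rounds} rounds)")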


Phoronix Test Suite v10.8.5