general: 2 x AMD EPYC 9274F 24-Core testing with an ASUS ESC8000A-E12 K14PG-D24 (1201 BIOS) and an ASUS NVIDIA H100 NVL 94GB on Ubuntu 22.04, via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501312-NE-GENERAL1651&grs.
System details (Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core):
Processor: 2 x AMD EPYC 9274F 24-Core @ 4.05GHz (48 Cores / 96 Threads)
Motherboard: ASUS ESC8000A-E12 K14PG-D24 (1201 BIOS)
Chipset: AMD Device 14a4
Memory: 1136GB
Disk: 240GB MR9540-8i
Graphics: ASUS NVIDIA H100 NVL 94GB
Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb
OS: Ubuntu 22.04
Kernel: 6.8.0-51-generic (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.7.33
Vulkan: 1.3.289
Compiler: GCC 11.4.0 + CUDA 12.6
File-System: ext4
Screen Resolution: 1920x1200

Notes:
- Transparent Huge Pages: madvise
- Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Disk details: MQ-DEADLINE scheduler / relatime,rw,stripe=16 / Block Size: 4096
- Processor details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled); CPU Microcode: 0xa101148
- Python: 3.10.12
- Security: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Tests (result row: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core):
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 2048 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 1024 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 512 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 256 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 2048 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 1024 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 512 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 256 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 128 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256 llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128 llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16
llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
ai-benchmark: Device AI Score ai-benchmark: Device Training Score ai-benchmark: Device Inference Score
pybench: Total For Average Test Times
ncnn: Vulkan GPU - vision_transformer ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPUv2-yolov3 - mobilenetv2-yolov3 ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - mobilenet
ncnn: CPU - vision_transformer ncnn: CPU - regnety_400m ncnn: CPU - squeezenet_ssd ncnn: CPU - yolov4-tiny ncnn: CPU - resnet50 ncnn: CPU - alexnet ncnn: CPU - resnet18 ncnn: CPU - vgg16 ncnn: CPU - googlenet ncnn: CPU - blazeface ncnn: CPU - efficientnet-b0 ncnn: CPU - mnasnet ncnn: CPU - shufflenet-v2 ncnn: CPU-v2-v2 - mobilenet-v2
spacy: en_core_web_lg
stress-ng: Hyperbolic Trigonometric Math stress-ng: POSIX Regular Expressions stress-ng: System V Message Passing stress-ng: Glibc Qsort Data Sorting stress-ng: Glibc C String Functions stress-ng: Integer Bit Operations stress-ng: Bessel Math Operations stress-ng: Vector Floating Point stress-ng: Bitonic Integer Sort stress-ng: Trigonometric Math stress-ng: Fused Multiply-Add stress-ng: Radix String Sort stress-ng: Fractal Generator stress-ng: Context Switching stress-ng: Wide Vector Math stress-ng: Logarithmic Math stress-ng: Jpeg Compression stress-ng: Exponential Math stress-ng: Socket Activity stress-ng: Mixed Scheduler stress-ng: Vector Shuffle stress-ng: Memory Copying stress-ng: Matrix 3D Math stress-ng: Floating Point stress-ng: x86_64 RdRand stress-ng: Function Call stress-ng: Integer Math stress-ng: AVX-512 VNNI stress-ng: Vector Math stress-ng: Matrix Math stress-ng: Semaphores stress-ng: Power Math stress-ng: CPU Stress stress-ng: CPU Cache stress-ng: SENDFILE stress-ng: AVL Tree stress-ng: Pthread stress-ng: Forking stress-ng: Cloning stress-ng: Malloc stress-ng: Atomic stress-ng: Mutex stress-ng: MEMFD stress-ng: Futex stress-ng: Zlib stress-ng: Poll stress-ng: Pipe stress-ng: NUMA stress-ng: MMAP stress-ng: Hash
pytorch: NVIDIA CUDA GPU - 512 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 256 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 64 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 32 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 16 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 1 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 512 - ResNet-152 pytorch: NVIDIA CUDA GPU - 256 - ResNet-152 pytorch: NVIDIA CUDA GPU - 64 - ResNet-152 pytorch: NVIDIA CUDA GPU - 512 - ResNet-50 pytorch: NVIDIA CUDA GPU - 32 - ResNet-152 pytorch: NVIDIA CUDA GPU - 256 - ResNet-50 pytorch: NVIDIA CUDA GPU - 16 - ResNet-152 pytorch: NVIDIA CUDA GPU - 64 - ResNet-50 pytorch: NVIDIA CUDA GPU - 32 - ResNet-50 pytorch: NVIDIA CUDA GPU - 16 - ResNet-50 pytorch: NVIDIA CUDA GPU - 1 - ResNet-152 pytorch: NVIDIA CUDA GPU - 1 - ResNet-50 pytorch: CPU - 512 - Efficientnet_v2_l pytorch: CPU - 256 - Efficientnet_v2_l pytorch: CPU - 64 - Efficientnet_v2_l pytorch: CPU - 32 - Efficientnet_v2_l pytorch: CPU - 16 - Efficientnet_v2_l pytorch: CPU - 1 - Efficientnet_v2_l pytorch: CPU - 512 - ResNet-152 pytorch: CPU - 256 - ResNet-152 pytorch: CPU - 64 - ResNet-152 pytorch: CPU - 512 - ResNet-50 pytorch: CPU - 32 - ResNet-152 pytorch: CPU - 256 - ResNet-50 pytorch: CPU - 16 - ResNet-152 pytorch: CPU - 64 - ResNet-50 pytorch: CPU - 32 - ResNet-50 pytorch: CPU - 16 - ResNet-50 pytorch: CPU - 1 - ResNet-152 pytorch: CPU - 1 - ResNet-50
tensorflow-lite: Mobilenet Quant tensorflow-lite: Mobilenet Float tensorflow-lite: NASNet Mobile tensorflow-lite: Inception V4 tensorflow-lite: SqueezeNet
litert: Quantized COCO SSD MobileNet v1 litert: Inception ResNet V2 litert: Mobilenet Float litert: NASNet Mobile litert: Inception V4 litert: SqueezeNet litert: DeepLab V3
intel-mpi: IMB-MPI1 Sendrecv intel-mpi: IMB-MPI1 Sendrecv intel-mpi: IMB-MPI1 Exchange intel-mpi: IMB-MPI1 Exchange intel-mpi: IMB-P2P PingPong
rbenchmark: cython-bench: N-Queens numpy:
onednn: Recurrent Neural Network Inference - CPU onednn: Recurrent Neural Network Training - CPU onednn: Deconvolution Batch shapes_3d - CPU onednn: Deconvolution Batch shapes_1d - CPU onednn: Convolution Batch Shapes Auto - CPU onednn: IP Shapes 3D - CPU onednn: IP Shapes 1D - CPU
build-llvm: Unix Makefiles build-llvm: Ninja build-linux-kernel: allmodconfig build-linux-kernel: defconfig build-gcc: Time To Compile
epoch: Cone mrbayes: Primate Phylogeny Analysis
npb: SP.C npb: SP.B npb: MG.C npb: LU.C npb: IS.D npb: FT.C npb: EP.D npb: EP.C npb: CG.C npb: BT.C
hpl:
glibc-bench: pthread_once glibc-bench: sincos glibc-bench: ffsll glibc-bench: atanh glibc-bench: asinh glibc-bench: tanh glibc-bench: sqrt glibc-bench: sinh glibc-bench: modf glibc-bench: log2 glibc-bench: sin glibc-bench: pow glibc-bench: ffs glibc-bench: exp glibc-bench: cos
compilebench: Read Compiled Tree compilebench: Initial Create compilebench: Compile
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 ncnn: Vulkan GPU - FastestDet ncnn: CPU - FastestDet ncnn: CPUv2-yolov3 - mobilenetv2-yolov3 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - mobilenet spacy: en_core_web_trf stress-ng: IO_uring stress-ng: Crypto tensorflow-lite: Inception ResNet V2 litert: Mobilenet Quant intel-mpi: IMB-MPI1 PingPong
Result values (Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core, in the order of the tests above):
12288 6144 3072 1536 32768 16384 8192 4096 5.00 32768 16384 4.92 24.12 8192 4096 24.04 56.82 32768 16384 57.78 8192 4096 42.16 41.91 3121.14 3150.78 3144.42 84.98 211.93 212.92 18308.61 13052.60 8390.83 63.08 151.83 44.00 42.88 43.54 18283.75 13019.73 8352.57 18.58 147.63 45.57 43.85 43.87 17.56 6576 3380 3196 738 56.65 46.67 23.72 31.30 23.15 22.63 6.58 12.46 41.41 22.25 6.35 16.99 11.69 16.56 13.33 11.77 23.15 57.21 45.73 23.98 31.75 22.82 6.32 12.69 42.12 22.65 6.39 16.71 11.52 16.76 11.85 15338 293490.26 444669.60 14211326.19 1590.91 56864254.14 9623549.83 36898.14 182971.03 672.51 151938.42 47688547.66 2158.48 393.67 15485954.92 2734553.44 374205.22 66045.55 198051.5 20499.66 38187.39 42786.90 13628.14 16067.53 20022.51 20427852.82 45580.45 4586743.65 6540887.21 407859.79 310400.31 61709090.58 119629.76 145536.61 771976.41 916359.11 1194.06 135182.40 62039.40 8039.79 106614259.32 257.02 14723637.91 3377.15 2352726.22 6888.22 5524301.71 21174901.61 400.80 12391.70 12298718.43 46.72 47.34 47.80 46.77 47.32 51.52 102.98 102.81 102.21 283.02 102.43 282.30 102.11 279.19 282.75 279.17 101.94 287.46 6.09 6.07 6.18 6.05 5.99 8.92 11.61 11.45 11.51 30.05 12.29 29.01 11.27 29.06 29.03 29.77 13.97 34.93 4098.48 3350.72 69069.7 37181.9 5088.14 5503.91 45507.6 3345.85 75527.3 38062.4 5245.27 8793.87 79.47 3393.59 108.05 6073.65 28744589 0.1446 17.977 565.16 398.494 669.683 1.10328 8.66502 0.673369 0.310889 0.827742 198.897 135.289 258.637 30.299 913.620 384.19 81.583 108596.64 146367.34 119325.63 199460.27 4064.08 102149.25 7273.96 6692.95 50419.73 176096.16 631.30 5.65547 38.9805 5.67055 28.1757 22.5830 26.9726 8.24574 22.9550 6.61957 10.4215 63.1227 35.5607 5.67032 15.1361 70.8158 2657.42 399.79 1613.38 193.81 17.29 17.79 23.31 14.36 23.31 2751 174744.18 402622178.28 68557.9 3255.76 3739.53
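Each per-test result below is reported as a mean with "SE +/- x, N = y", i.e. the standard error of the mean over N benchmark runs. As a minimal sketch of that calculation (the three run values here are hypothetical, not taken from this result file):

```python
import statistics

# Hypothetical three-run sample (N = 3); not values from this result file.
runs = [24.08, 24.12, 24.16]

n = len(runs)
mean = statistics.mean(runs)
# Standard error of the mean: sample standard deviation divided by sqrt(N).
se = statistics.stdev(runs) / n ** 0.5

print(f"{mean:.2f} (SE +/- {se:.2f}, N = {n})")
```

A smaller SE relative to the mean indicates more repeatable runs; the Phoronix Test Suite also re-runs tests with high variance, which is presumably why some results below show N = 9, 12, or 15.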
Llamafile 0.8.16 (OpenBenchmarking.org; Tokens Per Second, More Is Better)
System: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core
Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 2048: 12288 (SE +/- 0.00, N = 3)
Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 1024: 6144 (SE +/- 0.00, N = 3)
Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 512: 3072 (SE +/- 0.00, N = 3)
Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 256: 1536 (SE +/- 0.00, N = 3)
Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 2048: 32768 (SE +/- 0.00, N = 3)
Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 1024: 16384 (SE +/- 0.00, N = 3)
Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 512: 8192 (SE +/- 0.00, N = 3)
Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 256: 4096 (SE +/- 0.00, N = 3)
Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128: 5.00 (SE +/- 0.02, N = 3)
Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048: 32768 (SE +/- 0.00, N = 3)
Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024: 16384 (SE +/- 0.00, N = 3)
Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16: 4.92 (SE +/- 0.03, N = 3)
Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128: 24.12 (SE +/- 0.04, N = 3)
Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512: 8192 (SE +/- 0.00, N = 3)
Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256: 4096 (SE +/- 0.00, N = 3)
Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16: 24.04 (SE +/- 0.09, N = 3)
Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128: 56.82 (SE +/- 0.59, N = 3)
Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048: 32768 (SE +/- 0.00, N = 3)
Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024: 16384 (SE +/- 0.00, N = 3)
Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16: 57.78 (SE +/- 0.61, N = 4)
Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512: 8192 (SE +/- 0.00, N = 3)
Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256: 4096 (SE +/- 0.00, N = 3)
Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128: 42.16 (SE +/- 0.30, N = 3)
Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16: 41.91 (SE +/- 0.21, N = 3)
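One pattern in the llamafile prompt-processing numbers above: for each model the reported tokens-per-second figure scales exactly linearly with the prompt length (for wizardcoder, 12288/6144/3072/1536 over 2048/1024/512/256-token prompts is a constant factor of 6; for mistral-7b the factor is 16, with SE +/- 0.00 throughout). A quick check of that arithmetic from the reported values:

```python
# Reported llamafile prompt-processing results (tokens/sec) keyed by prompt
# length, copied from the result tables above.
wizardcoder = {2048: 12288, 1024: 6144, 512: 3072, 256: 1536}
mistral = {2048: 32768, 1024: 16384, 512: 8192, 256: 4096}

def rate_factors(results):
    # Tokens/sec divided by prompt length; constant when scaling is exactly linear.
    return {length: tps / length for length, tps in results.items()}

print(rate_factors(wizardcoder))  # every factor is 6.0
print(rate_factors(mistral))      # every factor is 16.0
```

Because the four prompt sizes collapse to one constant per model, the prompt-processing results for a given model here effectively carry a single rate rather than four independent data points.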
Llama.cpp b4397 (OpenBenchmarking.org; Tokens Per Second, More Is Better)
System: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core; 1. (CXX) g++ options: -O3
Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048: 3121.14 (SE +/- 0.10, N = 3)
Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024: 3150.78 (SE +/- 0.63, N = 3)
Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512: 3144.42 (SE +/- 0.28, N = 3)
Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128: 84.98 (SE +/- 0.01, N = 3)
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048: 211.93 (SE +/- 2.80, N = 3)
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024: 212.92 (SE +/- 2.68, N = 15)
Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048: 18308.61 (SE +/- 1.84, N = 3)
Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024: 13052.60 (SE +/- 1.13, N = 3)
Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512: 8390.83 (SE +/- 0.48, N = 3)
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128: 63.08 (SE +/- 0.35, N = 3)
Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128: 151.83 (SE +/- 0.10, N = 3)
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048: 44.00 (SE +/- 0.84, N = 9)
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024: 42.88 (SE +/- 0.38, N = 12)
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512: 43.54 (SE +/- 0.69, N = 12)
Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048: 18283.75 (SE +/- 0.46, N = 3)
Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024: 13019.73 (SE +/- 5.36, N = 3)
Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512: 8352.57 (SE +/- 0.80, N = 3)
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128: 18.58 (SE +/- 0.03, N = 3)
Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128: 147.63 (SE +/- 0.09, N = 3)
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048: 45.57 (SE +/- 0.99, N = 7)
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024: 43.85 (SE +/- 0.75, N = 12)
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512: 43.87 (SE +/- 0.40, N = 15)
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128: 17.56 (SE +/- 0.02, N = 3)
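The Llama.cpp results above allow a direct CUDA-vs-CPU comparison on the same build: for Mistral-7B-Instruct-v0.3-Q8_0, the H100 NVL processes the 2048-token prompt at 18308.61 tokens/sec versus 44.00 on CPU BLAS, while text generation is 151.83 versus 18.58. A small sketch computing those ratios from the reported means (standard errors ignored for this rough comparison):

```python
# Mean tokens/sec copied from the Llama.cpp b4397 results above
# (Mistral-7B-Instruct-v0.3-Q8_0).
cuda = {"pp2048": 18308.61, "pp1024": 13052.60, "pp512": 8390.83, "tg128": 151.83}
cpu_blas = {"pp2048": 44.00, "pp1024": 42.88, "pp512": 43.54, "tg128": 18.58}

for test in cuda:
    speedup = cuda[test] / cpu_blas[test]
    print(f"{test}: {speedup:.1f}x faster on CUDA")
```

The gap is far larger for batched prompt processing (hundreds of times) than for sequential text generation (under 10x), which is consistent with generation being memory-bandwidth-bound rather than compute-bound.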
AI Benchmark Alpha 0.1.2 (OpenBenchmarking.org; Score, More Is Better)
System: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core
Device AI Score: 6576
Device Training Score: 3380
Device Inference Score: 3196

PyBench 2018-02-16 (OpenBenchmarking.org; Milliseconds, Fewer Is Better)
System: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core
Total For Average Test Times: 738 (SE +/- 3.06, N = 3)
NCNN 20241226 (OpenBenchmarking.org; ms, Fewer Is Better)
System: Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core; 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Target: Vulkan GPU - Model: vision_transformer: 56.65 (SE +/- 0.48, N = 3; MIN: 54.43 / MAX: 62.81)
Target: Vulkan GPU - Model: regnety_400m: 46.67 (SE +/- 0.91, N = 3; MIN: 45.2 / MAX: 88.32)
Target: Vulkan GPU - Model: squeezenet_ssd: 23.72 (SE +/- 0.30, N = 3; MIN: 23.06 / MAX: 29.24)
Target: Vulkan GPU - Model: yolov4-tiny: 31.30 (SE +/- 0.69, N = 3; MIN: 29.76 / MAX: 37.03)
Target: Vulkan GPUv2-yolov3 - Model: mobilenetv2-yolov3: 23.15 (SE +/- 0.17, N = 3; MIN: 22.82 / MAX: 28.93)
Target: Vulkan GPU - Model: resnet50: 22.63 (SE +/- 0.47, N = 3; MIN: 21.57 / MAX: 30.39)
Target: Vulkan GPU - Model: alexnet: 6.58 (SE +/- 0.16, N = 3; MIN: 6.19 / MAX: 11.02)
Target: Vulkan GPU - Model: resnet18: 12.46 (SE +/- 0.18, N = 3; MIN: 11.95 / MAX: 13.13)
Target: Vulkan GPU - Model: vgg16: 41.41 (SE +/- 1.07, N = 3; MIN: 39.19 / MAX: 48.36)
Target: Vulkan GPU - Model: googlenet: 22.25 (SE +/- 0.24, N = 3; MIN: 21.82 / MAX: 28.73)
Target: Vulkan GPU - Model: blazeface: 6.35 (SE +/- 0.05, N = 3; MIN: 6.16 / MAX: 11.26)
Target: Vulkan GPU - Model: efficientnet-b0: 16.99 (SE +/- 0.15, N = 3; MIN: 16.56 / MAX: 22.2)
Target: Vulkan GPU - Model: mnasnet: 11.69 (SE +/- 0.22, N = 3; MIN: 11.09 / MAX: 19.42)
Target: Vulkan GPU - Model: shufflenet-v2: 16.56 (SE +/- 0.09, N = 3; MIN: 16.28 / MAX: 17.55)
NCNN Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 3 6 9 12 15 SE +/- 0.23, N = 3 13.33 MIN: 12.65 / MAX: 18.55 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 3 6 9 12 15 SE +/- 0.40, N = 3 11.77 MIN: 11.14 / MAX: 13.15 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: mobilenet Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 6 12 18 24 30 SE +/- 0.17, N = 3 23.15 MIN: 22.82 / MAX: 28.93 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: CPU - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (OpenBenchmarking.org; ms, fewer is better)

Model                 Mean      SE      N     Min        Max
vision_transformer    57.21     0.39    12    52.00      568.96
regnety_400m          45.73     0.27    12    -425.63    85.35
squeezenet_ssd        23.98     0.32    12    21.56      65.28
yolov4-tiny           31.75     0.47    12    28.18      38.62
resnet50              22.82     0.25    12    21.27      32.64
alexnet                6.32     0.07    12     5.67      19.42
resnet18              12.69     0.15    12    11.43      28.06
vgg16                 42.12     0.44    12    38.09      61.81
googlenet             22.65     0.26    12    20.21      53.24
blazeface              6.39     0.05    12     5.90       7.26
efficientnet-b0       16.71     0.14    12    14.94      30.14
mnasnet               11.52     0.07    12    10.69      16.83
shufflenet-v2         16.76     0.05    12    16.18      40.16
mobilenet-v2          11.85     0.07    12    11.19      16.92

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
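Each NCNN figure above is reported as a mean over N timed runs together with its standard error and observed min/max. A minimal sketch of that reduction, with the actual NCNN forward pass replaced by a stand-in sleep (the `run` callable is hypothetical, not the NCNN API):

```python
import statistics
import time

def benchmark(run, n=3):
    # Time `run` n times and reduce to the statistics reported above:
    # mean, standard error of the mean, and observed min/max (all in ms).
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        run()
        samples.append((time.perf_counter() - t0) * 1000.0)
    mean = statistics.fmean(samples)
    se = statistics.stdev(samples) / len(samples) ** 0.5 if n > 1 else 0.0
    return mean, se, min(samples), max(samples)

# Stand-in workload: ~10 ms of sleep instead of a real NCNN inference.
mean_ms, se_ms, min_ms, max_ms = benchmark(lambda: time.sleep(0.010), n=3)
```

The standard error (sample standard deviation divided by sqrt(N)) is what the "SE +/-" columns denote, which is why the N = 12 CPU runs tend to have tighter SE values than the N = 3 Vulkan runs despite similar spreads.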
spaCy 3.4.1 - Model: en_core_web_lg - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (OpenBenchmarking.org; tokens/sec, more is better)

Result: 15338 tokens/sec (SE +/- 33.20, N = 3)
Stress-NG 0.18.09 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (OpenBenchmarking.org; Bogo Ops/s, more is better)

Test                              Mean            SE           N
Hyperbolic Trigonometric Math     293490.26       78.25        3
POSIX Regular Expressions         444669.60       185.23       3
System V Message Passing          14211326.19     27048.08     3
Glibc Qsort Data Sorting          1590.91         0.21         3
Glibc C String Functions          56864254.14     270620.44    3
Integer Bit Operations            9623549.83      1390.58      3
Bessel Math Operations            36898.14        2.68         3
Vector Floating Point             182971.03       489.30       3
Bitonic Integer Sort              672.51          0.35         3
Trigonometric Math                151938.42       47.69        3
Fused Multiply-Add                47688547.66     16313.26     3
Radix String Sort                 2158.48         14.11        3
Fractal Generator                 393.67          0.18         3
Context Switching                 15485954.92     27120.62     3
Wide Vector Math                  2734553.44      706.91       3
Logarithmic Math                  374205.22       588.05       3
Jpeg Compression                  66045.55        141.87       3
Exponential Math                  198051.50       1213.20      3
Socket Activity                   20499.66        5.10         3
Mixed Scheduler                   38187.39        172.62       3
Vector Shuffle                    42786.90        211.62       3
Memory Copying                    13628.14        18.97        3
Matrix 3D Math                    16067.53        58.20        3
Floating Point                    20022.51        76.48        3
x86_64 RdRand                     20427852.82     68905.39     3
Function Call                     45580.45        71.73        3
Integer Math                      4586743.65      19349.76     3
AVX-512 VNNI                      6540887.21      25442.23     3
Vector Math                       407859.79       2733.46      3
Matrix Math                       310400.31       26.15        3
Semaphores                        61709090.58     563816.56    3
Power Math                        119629.76       87.83        3
CPU Stress                        145536.61       271.68       3
CPU Cache                         771976.41       2865.93      3
SENDFILE                          916359.11       380.91       3
AVL Tree                          1194.06         1.00         3
Pthread                           135182.40       113.35       3
Forking                           62039.40        359.02       3
Cloning                           8039.79         68.82        3
Malloc                            106614259.32    781395.83    3
Atomic                            257.02          0.39         3
Mutex                             14723637.91     89617.88     3
MEMFD                             3377.15         3.40         3
Futex                             2352726.22      20321.47     3
Zlib                              6888.22         2.93         3
Poll                              5524301.71      2718.46      3
Pipe                              21174901.61     278876.82    12
NUMA                              400.80          2.65         3
MMAP                              12391.70        89.55        3
Hash                              12298718.43     46080.65     3

1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
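A quick sanity check on the Stress-NG results above: dividing each standard error by its mean gives the relative run-to-run noise, which stays well under 1% for these runs. A small sketch with three values copied from the data above:

```python
# (mean Bogo Ops/s, standard error) pairs copied from the Stress-NG results above
results = {
    "Hyperbolic Trigonometric Math": (293490.26, 78.25),
    "Glibc Qsort Data Sorting": (1590.91, 0.21),
    "Semaphores": (61709090.58, 563816.56),
}

# relative standard error, in percent of the mean
rse_pct = {name: se / mean * 100.0 for name, (mean, se) in results.items()}
```

Even the noisiest of the three (Semaphores, a heavily kernel-bound test) lands around 0.9%, so the three-run sample sizes are adequate for comparing these results.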
PyTorch 2.2.1 - Device: NVIDIA CUDA GPU - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (OpenBenchmarking.org; batches/sec, more is better)

Model               Batch   Mean      SE      N    Min       Max
Efficientnet_v2_l   512     46.72     0.07    3    39.24     48.63
Efficientnet_v2_l   256     47.34     0.50    3    39.26     50.11
Efficientnet_v2_l   64      47.80     0.44    3    2.23      49.81
Efficientnet_v2_l   32      46.77     0.61    3    36.59     48.40
Efficientnet_v2_l   16      47.32     0.24    3    36.64     48.53
Efficientnet_v2_l   1       51.52     0.62    4    36.53     55.15
ResNet-152          512     102.98    0.47    3    73.53     107.10
ResNet-152          256     102.81    0.62    3    -2.40     105.21
ResNet-152          64      102.21    0.84    3    73.94     104.89
ResNet-152          32      102.43    1.05    3    75.28     104.92
ResNet-152          16      102.11    0.30    3    74.46     105.04
ResNet-152          1       101.94    1.00    3    72.80     104.88
ResNet-50           512     283.02    3.04    3    168.78    291.41
ResNet-50           256     282.30    3.61    3    166.48    290.65
ResNet-50           64      279.19    1.76    3    167.62    285.62
ResNet-50           32      282.75    2.42    3    166.76    292.39
ResNet-50           16      279.17    2.06    3    168.32    287.01
ResNet-50           1       287.46    3.16    3    131.87    296.46
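The batches/sec numbers above come from timing repeated forward passes after a warm-up phase (important on CUDA, where the first iterations pay kernel-compilation and cache-fill costs). A minimal throughput loop sketch, with the model call replaced by a stand-in sleep rather than a real PyTorch forward pass:

```python
import time

def batches_per_sec(step, warmup=2, iters=20):
    # Warm up first (CUDA kernel launches, caches), then time a fixed
    # number of steps and report the steady-state rate.
    for _ in range(warmup):
        step()
    t0 = time.perf_counter()
    for _ in range(iters):
        step()
    return iters / (time.perf_counter() - t0)

# Stand-in for model(batch): ~5 ms of sleep per "batch".
rate = batches_per_sec(lambda: time.sleep(0.005))
```

Skipping the warm-up is one common way to get the occasional very low minimum visible in the table (e.g. the 2.23 batches/sec floor on Efficientnet_v2_l at batch 64).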
PyTorch 2.2.1 - Device: CPU - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (OpenBenchmarking.org; batches/sec, more is better)

Model               Batch   Mean     SE      N     Min      Max
Efficientnet_v2_l   512     6.09     0.06    3     1.78     6.47
Efficientnet_v2_l   256     6.07     0.06    6     3.72     6.69
Efficientnet_v2_l   64      6.18     0.06    3     4.86     6.54
Efficientnet_v2_l   32      6.05     0.04    3     4.81     6.36
Efficientnet_v2_l   16      5.99     0.05    3     4.56     6.36
Efficientnet_v2_l   1       8.92     0.13    3     5.44     9.23
ResNet-152          512     11.61    0.09    12    6.77     12.65
ResNet-152          256     11.45    0.03    3     10.75    12.11
ResNet-152          64      11.51    0.12    3     7.49     12.29
ResNet-152          32      12.29    0.13    3     7.38     12.68
ResNet-152          16      11.27    0.11    12    6.86     12.45
ResNet-152          1       13.97    0.11    3     8.70     15.19
ResNet-50           512     30.05    0.44    12    17.64    35.18
ResNet-50           256     29.01    0.32    5     16.97    32.29
ResNet-50           64      29.06    0.18    3     16.50    32.60
ResNet-50           32      29.03    0.21    3     19.55    32.27
ResNet-50           16      29.77    0.40    15    13.43    34.10
ResNet-50           1       34.93    0.39    15    13.13    41.07
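Putting the two PyTorch device runs side by side: at batch size 512 the H100 NVL delivers roughly an 8-9x throughput advantage over the dual EPYC 9274F CPUs, narrowing somewhat on the more memory-bound Efficientnet_v2_l. An illustrative cross-check, with the batch-512 means copied from the results above:

```python
# batch-512 batches/sec, copied from the PyTorch results above
gpu = {"ResNet-50": 283.02, "ResNet-152": 102.98, "Efficientnet_v2_l": 46.72}
cpu = {"ResNet-50": 30.05, "ResNet-152": 11.61, "Efficientnet_v2_l": 6.09}

# GPU-over-CPU throughput ratio per model
speedup = {model: gpu[model] / cpu[model] for model in gpu}
```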
TensorFlow Lite 2022-05-18 - Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (OpenBenchmarking.org; microseconds, fewer is better)

Model             Mean       SE        N
Mobilenet Quant   4098.48    12.05     3
Mobilenet Float   3350.72    26.31     15
NASNet Mobile     69069.7    562.50    3
Inception V4      37181.9    358.95    3
SqueezeNet        5088.14    29.41     3
LiteRT 2024-10-15 (Microseconds, fewer is better):
- Model: Quantized COCO SSD MobileNet v1: 5503.91 [SE +/- 71.29, N = 13]
- Model: Inception ResNet V2: 45507.6 [SE +/- 312.59, N = 15]
- Model: Mobilenet Float: 3345.85 [SE +/- 39.32, N = 3]
- Model: NASNet Mobile: 75527.3 [SE +/- 699.04, N = 7]
- Model: Inception V4: 38062.4 [SE +/- 455.18, N = 3]
- Model: SqueezeNet: 5245.27 [SE +/- 33.72, N = 3]
- Model: DeepLab V3: 8793.87 [SE +/- 57.34, N = 15]
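The TensorFlow Lite and LiteRT figures are per-inference latencies in microseconds, while results such as the PyTorch one are throughput numbers. To compare the two styles, a latency can be inverted into single-stream throughput; a small sketch (function name is illustrative, not from either harness):

```python
def us_to_per_sec(latency_us: float) -> float:
    """Convert a per-inference latency in microseconds to
    inferences per second (1 s = 1,000,000 us)."""
    return 1_000_000 / latency_us

# E.g. the ~3345.85 us Mobilenet Float latency above corresponds to
# roughly 299 single-stream inferences per second.
print(round(us_to_per_sec(3345.85), 1))
```

Note this assumes one inference at a time; batched or parallel execution would yield higher aggregate throughput than this simple inversion suggests.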
Intel MPI Benchmarks 2019.3:
- Test: IMB-MPI1 Sendrecv (Average usec, fewer is better): 79.47 [SE +/- 0.66, N = 3, MIN: 0.64 / MAX: 1174.19]
- Test: IMB-MPI1 Sendrecv (Average Mbytes/sec, more is better): 3393.59 [SE +/- 22.79, N = 3, MAX: 13584.14]
- Test: IMB-MPI1 Exchange (Average usec, fewer is better): 108.05 [SE +/- 3.64, N = 3, MIN: 1.13 / MAX: 1949.29]
- Test: IMB-MPI1 Exchange (Average Mbytes/sec, more is better): 6073.65 [SE +/- 18.92, N = 3, MAX: 26527.08]
- Test: IMB-P2P PingPong (Average Msg/sec, more is better): 28744589 [SE +/- 68029.72, N = 3, MIN: 15552 / MAX: 71388566]
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
R Benchmark (Seconds, fewer is better): 0.1446 [SE +/- 0.0014, N = 3] 1. R scripting front-end version 4.1.2 (2021-11-01)
Cython Benchmark 0.29.21 - Test: N-Queens (Seconds, fewer is better): 17.98 [SE +/- 0.14, N = 3]
Numpy Benchmark (Score, more is better): 565.16 [SE +/- 0.45, N = 3]
oneDNN 3.6 (ms, fewer is better):
- Harness: Recurrent Neural Network Inference - Engine: CPU: 398.49 [SE +/- 0.88, N = 3, MIN: 390.22]
- Harness: Recurrent Neural Network Training - Engine: CPU: 669.68 [SE +/- 1.20, N = 3, MIN: 653.95]
- Harness: Deconvolution Batch shapes_3d - Engine: CPU: 1.10328 [SE +/- 0.00100, N = 3, MIN: 1.05]
- Harness: Deconvolution Batch shapes_1d - Engine: CPU: 8.66502 [SE +/- 0.03403, N = 3, MIN: 7.33]
- Harness: Convolution Batch Shapes Auto - Engine: CPU: 0.673369 [SE +/- 0.001462, N = 3, MIN: 0.64]
- Harness: IP Shapes 3D - Engine: CPU: 0.310889 [SE +/- 0.002020, N = 3, MIN: 0.28]
- Harness: IP Shapes 1D - Engine: CPU: 0.827742 [SE +/- 0.005050, N = 3, MIN: 0.77]
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, fewer is better): 198.90 [SE +/- 1.10, N = 3]
Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, fewer is better): 135.29 [SE +/- 1.19, N = 3]
Timed Linux Kernel Compilation 6.8 - Build: allmodconfig (Seconds, fewer is better): 258.64 [SE +/- 0.84, N = 3]
Timed Linux Kernel Compilation 6.8 - Build: defconfig (Seconds, fewer is better): 30.30 [SE +/- 0.22, N = 14]
Timed GCC Compilation 13.2 - Time To Compile (Seconds, fewer is better): 913.62 [SE +/- 0.79, N = 3]
Epoch 4.19.4 - Epoch3D Deck: Cone (Seconds, fewer is better): 384.19 [SE +/- 4.25, N = 3] 1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Timed MrBayes Analysis 3.2.7 - Primate Phylogeny Analysis (Seconds, fewer is better): 81.58 [SE +/- 0.77, N = 3] 1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm -lreadline
NAS Parallel Benchmarks 3.4 (Total Mop/s, more is better):
- Test / Class: SP.C: 108596.64 [SE +/- 153.92, N = 3]
- Test / Class: SP.B: 146367.34 [SE +/- 1283.80, N = 8]
- Test / Class: MG.C: 119325.63 [SE +/- 947.46, N = 3]
- Test / Class: LU.C: 199460.27 [SE +/- 202.80, N = 3]
- Test / Class: IS.D: 4064.08 [SE +/- 15.41, N = 3]
- Test / Class: FT.C: 102149.25 [SE +/- 426.91, N = 3]
- Test / Class: EP.D: 7273.96 [SE +/- 66.24, N = 7]
- Test / Class: EP.C: 6692.95 [SE +/- 96.53, N = 15]
- Test / Class: CG.C: 50419.73 [SE +/- 396.73, N = 15]
- Test / Class: BT.C: 176096.16 [SE +/- 248.09, N = 3]
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
HPL Linpack 2.3 (GFLOPS, more is better): 631.30 [SE +/- 0.39, N = 3] 1. (CC) gcc options: -O2 -lopenblas -lm -lmpi
Glibc Benchmarks 2.39 (ns, fewer is better):
- Benchmark: pthread_once: 5.65547 [SE +/- 0.00045, N = 3]
- Benchmark: sincos: 38.98 [SE +/- 0.01, N = 3]
- Benchmark: ffsll: 5.67055 [SE +/- 0.00011, N = 3]
- Benchmark: atanh: 28.18 [SE +/- 0.00, N = 3]
- Benchmark: asinh: 22.58 [SE +/- 0.00, N = 3]
- Benchmark: tanh: 26.97 [SE +/- 0.00, N = 3]
- Benchmark: sqrt: 8.24574 [SE +/- 0.00536, N = 3]
- Benchmark: sinh: 22.96 [SE +/- 0.02, N = 3]
- Benchmark: modf: 6.61957 [SE +/- 0.00034, N = 3]
- Benchmark: log2: 10.42 [SE +/- 0.00, N = 3]
- Benchmark: sin: 63.12 [SE +/- 0.00, N = 3]
- Benchmark: pow: 35.56 [SE +/- 0.13, N = 3]
- Benchmark: ffs: 5.67032 [SE +/- 0.00021, N = 3]
- Benchmark: exp: 15.14 [SE +/- 0.00, N = 3]
- Benchmark: cos: 70.82 [SE +/- 0.01, N = 3]
1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s
Compile Bench 0.6 (MB/s, more is better):
- Test: Read Compiled Tree: 2657.42 [SE +/- 6.72, N = 3]
- Test: Initial Create: 399.79 [SE +/- 4.73, N = 3]
- Test: Compile: 1613.38 [SE +/- 11.78, N = 3]
Llama.cpp b4397 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, more is better): 193.81 [SE +/- 6.45, N = 12] 1. (CXX) g++ options: -O3
NCNN 20241226 (ms, fewer is better):
- Target: Vulkan GPU - Model: FastestDet: 17.29 [SE +/- 0.64, N = 3, MIN: 15.96 / MAX: 23.34]
- Target: CPU - Model: FastestDet: 17.79 [SE +/- 0.32, N = 12, MIN: 16.23 / MAX: 24.59]
- Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3: 23.31 [SE +/- 0.42, N = 12, MIN: 20.72 / MAX: 49.06]
- Target: CPU-v3-v3 - Model: mobilenet-v3: 14.36 [SE +/- 1.23, N = 12, MIN: 12.51 / MAX: 1452]
- Target: CPU - Model: mobilenet: 23.31 [SE +/- 0.42, N = 12, MIN: 20.72 / MAX: 49.06]
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
spaCy 3.4.1 - Model: en_core_web_trf (tokens/sec, more is better): 2751 [SE +/- 96.13, N = 3]
Stress-NG 0.18.09 (Bogo Ops/s, more is better):
- Test: IO_uring: 174744.18 [SE +/- 4811.73, N = 12]
- Test: Crypto: 402622178.28 [SE +/- 64894044.34, N = 15]
1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
TensorFlow Lite 2022-05-18 - Model: Inception ResNet V2 (Microseconds, fewer is better): 68557.9 [SE +/- 1796.42, N = 15]
LiteRT 2024-10-15 - Model: Mobilenet Quant (Microseconds, fewer is better): 3255.76 [SE +/- 51.59, N = 15]
Intel MPI Benchmarks 2019.3 - Test: IMB-MPI1 PingPong (Average Mbytes/sec, more is better): 3739.53 [SE +/- 61.28, N = 15, MIN: 3.8 / MAX: 14414.3] 1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5