general: 2 x AMD EPYC 9274F 24-Core testing with an ASUS ESC8000A-E12 K14PG-D24 (1201 BIOS) motherboard and an ASUS NVIDIA H100 NVL 94GB graphics card on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501312-NE-GENERAL1651&gru.
Processor: 2 x AMD EPYC 9274F 24-Core @ 4.05GHz (48 Cores / 96 Threads)
Motherboard: ASUS ESC8000A-E12 K14PG-D24 (1201 BIOS)
Chipset: AMD Device 14a4
Memory: 1136GB
Disk: 240GB MR9540-8i
Graphics: ASUS NVIDIA H100 NVL 94GB
Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb
OS: Ubuntu 22.04
Kernel: 6.8.0-51-generic (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.7.33
Vulkan: 1.3.289
Compiler: GCC 11.4.0 + CUDA 12.6
File-System: ext4
Screen Resolution: 1920x1200

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: MQ-DEADLINE / relatime,rw,stripe=16 / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled); CPU Microcode: 0xa101148
Python Notes: Python 3.10.12
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Results Overview — tests run (the result values for Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core are listed after the test names, in the same order):

intel-mpi: IMB-MPI1 Exchange intel-mpi: IMB-MPI1 PingPong intel-mpi: IMB-MPI1 Sendrecv intel-mpi: IMB-P2P PingPong

pytorch: CPU - 1 - ResNet-50 pytorch: CPU - 1 - ResNet-152 pytorch: CPU - 16 - ResNet-50 pytorch: CPU - 32 - ResNet-50 pytorch: CPU - 64 - ResNet-50 pytorch: CPU - 16 - ResNet-152 pytorch: CPU - 256 - ResNet-50 pytorch: CPU - 32 - ResNet-152 pytorch: CPU - 512 - ResNet-50 pytorch: CPU - 64 - ResNet-152 pytorch: CPU - 256 - ResNet-152 pytorch: CPU - 512 - ResNet-152 pytorch: CPU - 1 - Efficientnet_v2_l pytorch: CPU - 16 - Efficientnet_v2_l pytorch: CPU - 32 - Efficientnet_v2_l pytorch: CPU - 64 - Efficientnet_v2_l pytorch: CPU - 256 - Efficientnet_v2_l pytorch: CPU - 512 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 1 - ResNet-50 pytorch: NVIDIA CUDA GPU - 1 - ResNet-152 pytorch: NVIDIA CUDA GPU - 16 - ResNet-50 pytorch: NVIDIA CUDA GPU - 32 - ResNet-50 pytorch: NVIDIA CUDA GPU - 64 - ResNet-50 pytorch: NVIDIA CUDA GPU - 16 - ResNet-152 pytorch: NVIDIA CUDA GPU - 256 - ResNet-50 pytorch: NVIDIA CUDA GPU - 32 - ResNet-152 pytorch: NVIDIA CUDA GPU - 512 - ResNet-50 pytorch: NVIDIA CUDA GPU - 64 - ResNet-152 pytorch: NVIDIA CUDA GPU - 256 - ResNet-152 pytorch: NVIDIA CUDA GPU - 512 - ResNet-152 pytorch: NVIDIA CUDA GPU - 1 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 16 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 32 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 64 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 256 - Efficientnet_v2_l pytorch: NVIDIA CUDA GPU - 512 - Efficientnet_v2_l

stress-ng: Hash stress-ng: MMAP stress-ng: NUMA stress-ng: Pipe stress-ng: Poll stress-ng: Zlib stress-ng: Futex stress-ng: MEMFD stress-ng: Mutex stress-ng: Atomic stress-ng: Crypto stress-ng: Malloc stress-ng: Cloning stress-ng: Forking stress-ng: Pthread stress-ng: AVL Tree stress-ng: IO_uring stress-ng: SENDFILE stress-ng: CPU Cache stress-ng: CPU Stress stress-ng: Power Math stress-ng: Semaphores stress-ng: Matrix Math stress-ng: Vector Math stress-ng: AVX-512 VNNI stress-ng: Integer Math stress-ng: Function Call stress-ng: x86_64 RdRand stress-ng: Floating Point stress-ng: Matrix 3D Math stress-ng: Memory Copying stress-ng: Vector Shuffle stress-ng: Mixed Scheduler stress-ng: Socket Activity stress-ng: Exponential Math stress-ng: Jpeg Compression stress-ng: Logarithmic Math stress-ng: Wide Vector Math stress-ng: Context Switching stress-ng: Fractal Generator stress-ng: Radix String Sort stress-ng: Fused Multiply-Add stress-ng: Trigonometric Math stress-ng: Bitonic Integer Sort stress-ng: Vector Floating Point stress-ng: Bessel Math Operations stress-ng: Integer Bit Operations stress-ng: Glibc C String Functions stress-ng: Glibc Qsort Data Sorting stress-ng: System V Message Passing stress-ng: POSIX Regular Expressions stress-ng: Hyperbolic Trigonometric Math

hpl:

compilebench: Compile compilebench: Initial Create compilebench: Read Compiled Tree

numpy:

ai-benchmark: Device Inference Score ai-benchmark: Device Training Score ai-benchmark: Device AI Score

llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048

llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16 llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024 llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 128 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 256 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 512 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 1024 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 2048 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 256 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 512 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 1024 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 2048

spacy: en_core_web_lg spacy: en_core_web_trf

npb: BT.C npb: CG.C npb: EP.C npb: EP.D npb: FT.C npb: IS.D npb: LU.C npb: MG.C npb: SP.B npb: SP.C

intel-mpi: IMB-MPI1 Exchange intel-mpi: IMB-MPI1 Sendrecv

litert: DeepLab V3 litert: SqueezeNet litert: Inception V4 litert: NASNet Mobile litert: Mobilenet Float litert: Mobilenet Quant litert: Inception ResNet V2 litert: Quantized COCO SSD MobileNet v1

tensorflow-lite: SqueezeNet tensorflow-lite: Inception V4 tensorflow-lite: NASNet Mobile tensorflow-lite: Mobilenet Float tensorflow-lite: Mobilenet Quant tensorflow-lite: Inception ResNet V2

pybench: Total For Average Test Times

onednn: IP Shapes 1D - CPU onednn: IP Shapes 3D - CPU onednn: Convolution Batch Shapes Auto - CPU onednn: Deconvolution Batch shapes_1d - CPU onednn: Deconvolution Batch shapes_3d - CPU onednn: Recurrent Neural Network Training - CPU onednn: Recurrent Neural Network Inference - CPU

ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 ncnn: CPU - yolov4-tiny ncnn: CPU - squeezenet_ssd ncnn: CPU - regnety_400m ncnn: CPU - vision_transformer ncnn: CPU - FastestDet ncnn: Vulkan GPU - mobilenet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - vision_transformer ncnn: Vulkan GPU - FastestDet

glibc-bench: cos glibc-bench: exp glibc-bench: ffs glibc-bench: pow glibc-bench: sin glibc-bench: log2 glibc-bench: modf glibc-bench: sinh glibc-bench: sqrt glibc-bench: tanh glibc-bench: asinh glibc-bench: atanh glibc-bench: ffsll glibc-bench: sincos glibc-bench: pthread_once

mrbayes: Primate Phylogeny Analysis epoch: Cone build-gcc: Time To Compile build-linux-kernel: defconfig build-linux-kernel: allmodconfig build-llvm: Ninja build-llvm: Unix Makefiles cython-bench: N-Queens rbenchmark: scikit-learn: Sparse Rand Projections / 100 Iterations

Results — Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core (same order as the test names above):
6073.65 3739.53 3393.59 28744589 34.93 13.97 29.77 29.03 29.06 11.27 29.01 12.29 30.05 11.51 11.45 11.61 8.92 5.99 6.05 6.18 6.07 6.09 287.46 101.94 279.17 282.75 279.19 102.11 282.30 102.43 283.02 102.21 102.81 102.98 51.52 47.32 46.77 47.80 47.34 46.72 12298718.43 12391.70 400.80 21174901.61 5524301.71 6888.22 2352726.22 3377.15 14723637.91 257.02 402622178.28 106614259.32 8039.79 62039.40 135182.40 1194.06 174744.18 916359.11 771976.41 145536.61 119629.76 61709090.58 310400.31 407859.79 6540887.21 4586743.65 45580.45 20427852.82 20022.51 16067.53 13628.14 42786.90 38187.39 20499.66 198051.5 66045.55 374205.22 2734553.44 15485954.92 393.67 2158.48 47688547.66 151938.42 672.51 182971.03 36898.14 9623549.83 56864254.14 1590.91 14211326.19 444669.60 293490.26 631.30 1613.38 399.79 2657.42 565.16 3196 3380 6576 17.56 43.87 43.85 45.57 147.63 18.58 8352.57 13019.73 18283.75 43.54 42.88 44.00 151.83 63.08 8390.83 13052.60 18308.61 193.81 212.92 211.93 84.98 3144.42 3150.78 3121.14 41.91 42.16 4096 8192 57.78 16384 32768 56.82 24.04 4096 8192 24.12 4.92 16384 32768 5.00 4096 8192 16384 32768 1536 3072 6144 12288 15338 2751 176096.16 50419.73 6692.95 7273.96 102149.25 4064.08 199460.27 119325.63 146367.34 108596.64 108.05 79.47 8793.87 5245.27 38062.4 75527.3 3345.85 3255.76 45507.6 5503.91 5088.14 37181.9 69069.7 3350.72 4098.48 68557.9 738 0.827742 0.310889 0.673369 8.66502 1.10328 669.683 398.494 23.31 11.85 14.36 16.76 11.52 16.71 6.39 22.65 42.12 12.69 6.32 22.82 23.31 31.75 23.98 45.73 57.21 17.79 23.15 11.77 13.33 16.56 11.69 16.99 6.35 22.25 41.41 12.46 6.58 22.63 23.15 31.30 23.72 46.67 56.65 17.29 70.8158 15.1361 5.67032 35.5607 63.1227 10.4215 6.61957 22.9550 8.24574 26.9726 22.5830 28.1757 5.67055 38.9805 5.65547 81.583 384.19 913.620 30.299 258.637 135.289 198.897 17.977 0.1446
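Because the overview lists every test name first and every result value second, in matching order, the two lists can be zipped back into a name-to-value mapping. A minimal sketch, using only the four Intel MPI entries from the overview above (the variable names are illustrative):

```python
# Rebuild part of the overview table as a dict: the export lists test
# names and result values in the same order, so zip() pairs them up.
tests = [
    "intel-mpi: IMB-MPI1 Exchange",
    "intel-mpi: IMB-MPI1 PingPong",
    "intel-mpi: IMB-MPI1 Sendrecv",
    "intel-mpi: IMB-P2P PingPong",
]
results = [6073.65, 3739.53, 3393.59, 28744589]

overview = dict(zip(tests, results))
print(overview["intel-mpi: IMB-MPI1 Exchange"])  # 6073.65
```

The same pairing works for the full lists, provided the counts match.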
Intel MPI Benchmarks 2019.3 (more is better; compiled with g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi):
Test: IMB-MPI1 Exchange (Average Mbytes/sec): 6073.65 (SE +/- 18.92, N = 3; MAX: 26527.08)
Test: IMB-MPI1 PingPong (Average Mbytes/sec): 3739.53 (SE +/- 61.28, N = 15; MIN: 3.8 / MAX: 14414.3)
Test: IMB-MPI1 Sendrecv (Average Mbytes/sec): 3393.59 (SE +/- 22.79, N = 3; MAX: 13584.14)
Test: IMB-P2P PingPong (Average Msg/sec): 28744589 (SE +/- 68029.72, N = 3; MIN: 15552 / MAX: 71388566)
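Each result above is reported as a mean over N runs together with an "SE +/-" figure. Assuming the conventional standard error of the mean (sample standard deviation divided by the square root of N), it can be recomputed from per-run samples. The three throughput values below are hypothetical, for illustration only; the actual per-run samples behind, say, "SE +/- 18.92, N = 3" are not included in this export.

```python
import math
from statistics import mean, stdev

# Hypothetical per-run throughputs (Mbytes/sec) -- illustration only.
runs = [6055.0, 6070.0, 6096.0]

avg = mean(runs)                         # the reported result is the mean of N runs
se = stdev(runs) / math.sqrt(len(runs))  # standard error of the mean
print(round(avg, 2), round(se, 2))       # 6073.67 11.98
```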
PyTorch 2.2.1 (batches/sec, more is better):
Device: CPU - Batch Size: 1 - Model: ResNet-50: 34.93 (SE +/- 0.39, N = 15; MIN: 13.13 / MAX: 41.07)
Device: CPU - Batch Size: 1 - Model: ResNet-152: 13.97 (SE +/- 0.11, N = 3; MIN: 8.7 / MAX: 15.19)
Device: CPU - Batch Size: 16 - Model: ResNet-50: 29.77 (SE +/- 0.40, N = 15; MIN: 13.43 / MAX: 34.1)
Device: CPU - Batch Size: 32 - Model: ResNet-50: 29.03 (SE +/- 0.21, N = 3; MIN: 19.55 / MAX: 32.27)
Device: CPU - Batch Size: 64 - Model: ResNet-50: 29.06 (SE +/- 0.18, N = 3; MIN: 16.5 / MAX: 32.6)
Device: CPU - Batch Size: 16 - Model: ResNet-152: 11.27 (SE +/- 0.11, N = 12; MIN: 6.86 / MAX: 12.45)
Device: CPU - Batch Size: 256 - Model: ResNet-50: 29.01 (SE +/- 0.32, N = 5; MIN: 16.97 / MAX: 32.29)
Device: CPU - Batch Size: 32 - Model: ResNet-152: 12.29 (SE +/- 0.13, N = 3; MIN: 7.38 / MAX: 12.68)
Device: CPU - Batch Size: 512 - Model: ResNet-50: 30.05 (SE +/- 0.44, N = 12; MIN: 17.64 / MAX: 35.18)
Device: CPU - Batch Size: 64 - Model: ResNet-152: 11.51 (SE +/- 0.12, N = 3; MIN: 7.49 / MAX: 12.29)
Device: CPU - Batch Size: 256 - Model: ResNet-152: 11.45 (SE +/- 0.03, N = 3; MIN: 10.75 / MAX: 12.11)
Device: CPU - Batch Size: 512 - Model: ResNet-152: 11.61 (SE +/- 0.09, N = 12; MIN: 6.77 / MAX: 12.65)
Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l: 8.92 (SE +/- 0.13, N = 3; MIN: 5.44 / MAX: 9.23)
Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l: 5.99 (SE +/- 0.05, N = 3; MIN: 4.56 / MAX: 6.36)
Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l: 6.05 (SE +/- 0.04, N = 3; MIN: 4.81 / MAX: 6.36)
Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l: 6.18 (SE +/- 0.06, N = 3; MIN: 4.86 / MAX: 6.54)
Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l: 6.07 (SE +/- 0.06, N = 6; MIN: 3.72 / MAX: 6.69)
Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l: 6.09 (SE +/- 0.06, N = 3; MIN: 1.78 / MAX: 6.47)
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50: 287.46 (SE +/- 3.16, N = 3; MIN: 131.87 / MAX: 296.46)
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152: 101.94 (SE +/- 1.00, N = 3; MIN: 72.8 / MAX: 104.88)
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50: 279.17 (SE +/- 2.06, N = 3; MIN: 168.32 / MAX: 287.01)
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50: 282.75 (SE +/- 2.42, N = 3; MIN: 166.76 / MAX: 292.39)
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50: 279.19 (SE +/- 1.76, N = 3; MIN: 167.62 / MAX: 285.62)
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152: 102.11 (SE +/- 0.30, N = 3; MIN: 74.46 / MAX: 105.04)
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50: 282.30 (SE +/- 3.61, N = 3; MIN: 166.48 / MAX: 290.65)
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152: 102.43 (SE +/- 1.05, N = 3; MIN: 75.28 / MAX: 104.92)
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50: 283.02 (SE +/- 3.04, N = 3; MIN: 168.78 / MAX: 291.41)
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152: 102.21 (SE +/- 0.84, N = 3; MIN: 73.94 / MAX: 104.89)
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152: 102.81 (SE +/- 0.62, N = 3; MIN: -2.4 / MAX: 105.21)
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152: 102.98 (SE +/- 0.47, N = 3; MIN: 73.53 / MAX: 107.1)
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l: 51.52 (SE +/- 0.62, N = 4; MIN: 36.53 / MAX: 55.15)
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l: 47.32 (SE +/- 0.24, N = 3; MIN: 36.64 / MAX: 48.53)
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l: 46.77 (SE +/- 0.61, N = 3; MIN: 36.59 / MAX: 48.4)
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l: 47.80 (SE +/- 0.44, N = 3; MIN: 2.23 / MAX: 49.81)
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l: 47.34 (SE +/- 0.50, N = 3; MIN: 39.26 / MAX: 50.11)
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l: 46.72 (SE +/- 0.07, N = 3; MIN: 39.24 / MAX: 48.63)
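The PyTorch figures are batches-per-second throughput numbers. A generic sketch of how such a figure can be measured — warm-up iterations excluded, then timed iterations divided by elapsed wall time — is below. This is not the Phoronix Test Suite's actual harness, and the stand-in workload is a busy loop rather than a real model forward pass.

```python
import time

def batches_per_second(run_batch, warmup=3, iters=10):
    """Time a callable that processes one batch; return batches/sec."""
    for _ in range(warmup):              # warm-up runs are excluded from timing
        run_batch()
    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    elapsed = time.perf_counter() - start
    return iters / elapsed

# Stand-in workload instead of a real model forward pass.
rate = batches_per_second(lambda: sum(range(10000)))
print(rate > 0)
```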
Stress-NG 0.18.09 (Bogo Ops/s, more is better; compiled with g++ options: -O2 -std=gnu99 -lc -lm):
Test: Hash: 12298718.43 (SE +/- 46080.65, N = 3)
Test: MMAP: 12391.70 (SE +/- 89.55, N = 3)
Test: NUMA: 400.80 (SE +/- 2.65, N = 3)
Test: Pipe: 21174901.61 (SE +/- 278876.82, N = 12)
Test: Poll: 5524301.71 (SE +/- 2718.46, N = 3)
Test: Zlib: 6888.22 (SE +/- 2.93, N = 3)
Test: Futex: 2352726.22 (SE +/- 20321.47, N = 3)
Test: MEMFD: 3377.15 (SE +/- 3.40, N = 3)
Test: Mutex: 14723637.91 (SE +/- 89617.88, N = 3)
Test: Atomic: 257.02 (SE +/- 0.39, N = 3)
Test: Crypto: 402622178.28 (SE +/- 64894044.34, N = 15)
Test: Malloc: 106614259.32 (SE +/- 781395.83, N = 3)
Test: Cloning: 8039.79 (SE +/- 68.82, N = 3)
Test: Forking: 62039.40 (SE +/- 359.02, N = 3)
Test: Pthread: 135182.40 (SE +/- 113.35, N = 3)
Test: AVL Tree: 1194.06 (SE +/- 1.00, N = 3)
Test: IO_uring: 174744.18 (SE +/- 4811.73, N = 12)
Test: SENDFILE: 916359.11 (SE +/- 380.91, N = 3)
Test: CPU Cache: 771976.41 (SE +/- 2865.93, N = 3)
Test: CPU Stress: 145536.61 (SE +/- 271.68, N = 3)
Test: Power Math: 119629.76 (SE +/- 87.83, N = 3)
Test: Semaphores: 61709090.58 (SE +/- 563816.56, N = 3)
Test: Matrix Math: 310400.31 (SE +/- 26.15, N = 3)
Test: Vector Math: 407859.79 (SE +/- 2733.46, N = 3)
Test: AVX-512 VNNI: 6540887.21 (SE +/- 25442.23, N = 3)
Test: Integer Math: 4586743.65 (SE +/- 19349.76, N = 3)
Test: Function Call: 45580.45 (SE +/- 71.73, N = 3)
Test: x86_64 RdRand: 20427852.82 (SE +/- 68905.39, N = 3)
Test: Floating Point: 20022.51 (SE +/- 76.48, N = 3)
Test: Matrix 3D Math: 16067.53 (SE +/- 58.20, N = 3)
Test: Memory Copying: 13628.14 (SE +/- 18.97, N = 3)
Test: Vector Shuffle: 42786.90 (SE +/- 211.62, N = 3)
Test: Mixed Scheduler: 38187.39 (SE +/- 172.62, N = 3)
Test: Socket Activity: 20499.66 (SE +/- 5.10, N = 3)
Test: Exponential Math: 198051.5 (SE +/- 1213.20, N = 3)
Test: Jpeg Compression: 66045.55 (SE +/- 141.87, N = 3)
Stress-NG Test: Logarithmic Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Logarithmic Math Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 80K 160K 240K 320K 400K SE +/- 588.05, N = 3 374205.22 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Wide Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Wide Vector Math Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 600K 1200K 1800K 2400K 3000K SE +/- 706.91, N = 3 2734553.44 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Context Switching OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Context Switching Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 3M 6M 9M 12M 15M SE +/- 27120.62, N = 3 15485954.92 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Fractal Generator OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Fractal Generator Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 90 180 270 360 450 SE +/- 0.18, N = 3 393.67 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Radix String Sort OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Radix String Sort Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 500 1000 1500 2000 2500 SE +/- 14.11, N = 3 2158.48 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Fused Multiply-Add OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Fused Multiply-Add Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 10M 20M 30M 40M 50M SE +/- 16313.26, N = 3 47688547.66 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Trigonometric Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Trigonometric Math Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 30K 60K 90K 120K 150K SE +/- 47.69, N = 3 151938.42 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Bitonic Integer Sort OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Bitonic Integer Sort Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 150 300 450 600 750 SE +/- 0.35, N = 3 672.51 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Vector Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Vector Floating Point Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 40K 80K 120K 160K 200K SE +/- 489.30, N = 3 182971.03 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Bessel Math Operations OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Bessel Math Operations Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 8K 16K 24K 32K 40K SE +/- 2.68, N = 3 36898.14 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Integer Bit Operations OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Integer Bit Operations Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 2M 4M 6M 8M 10M SE +/- 1390.58, N = 3 9623549.83 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Glibc C String Functions OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Glibc C String Functions Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 12M 24M 36M 48M 60M SE +/- 270620.44, N = 3 56864254.14 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Glibc Qsort Data Sorting OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Glibc Qsort Data Sorting Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 300 600 900 1200 1500 SE +/- 0.21, N = 3 1590.91 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: System V Message Passing OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: System V Message Passing Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 3M 6M 9M 12M 15M SE +/- 27048.08, N = 3 14211326.19 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: POSIX Regular Expressions OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: POSIX Regular Expressions Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 100K 200K 300K 400K 500K SE +/- 185.23, N = 3 444669.60 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Hyperbolic Trigonometric Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.18.09 Test: Hyperbolic Trigonometric Math Ubuntu 22.04 - 2 x AMD EPYC 9274F 24-Core 60K 120K 180K 240K 300K SE +/- 78.25, N = 3 293490.26 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
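The SE +/- column above is the standard error of the mean across the N runs of each stressor. As a quick reference for how such a figure is derived, here is a minimal sketch; the three run values are illustrative placeholders, not taken from this report:

```python
import math

def standard_error(samples):
    """Standard error of the mean: sample std dev (Bessel-corrected) / sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    # n - 1 in the denominator: unbiased sample variance
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(var) / math.sqrt(n)

# Three hypothetical runs of a stressor (not values from this result file)
runs = [119500.0, 119650.0, 119800.0]
print(round(standard_error(runs), 2))
```

A small SE relative to the mean (as in most rows above) indicates the run-to-run spread was tight over the three iterations.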
HPL Linpack 2.3 (GFLOPS, More Is Better)
1. (CC) gcc options: -O2 -lopenblas -lm -lmpi

Result: 631.30 GFLOPS (SE +/- 0.39, N = 3)
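The HPL figure can be put in context against a back-of-the-envelope theoretical peak, peak = cores x clock x FLOPs/cycle. The sketch below assumes 16 double-precision FLOPs per cycle per core for this CPU and the 4.05 GHz clock from the system table; both are assumptions, and sustained clocks under all-core AVX load may differ, so treat the efficiency figure as a rough estimate only:

```python
# Back-of-the-envelope HPL efficiency estimate for this system.
# ASSUMPTIONS (not stated in the result file): 16 DP FLOPs/cycle/core,
# and that the 4.05 GHz listed clock is sustained during the run.
cores = 48            # 2 x 24-core EPYC 9274F
ghz = 4.05            # clock from the system table
flops_per_cycle = 16  # assumed DP FLOPs/cycle/core

peak_gflops = cores * ghz * flops_per_cycle
measured_gflops = 631.30  # HPL result above

print(f"theoretical peak: {peak_gflops:.1f} GFLOPS")
print(f"rough HPL efficiency: {measured_gflops / peak_gflops:.1%}")
```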
Compile Bench 0.6 (MB/s, More Is Better)

Test                  MB/s      SE +/-   N
Compile             1613.38     11.78    3
Initial Create       399.79      4.73    3
Read Compiled Tree  2657.42      6.72    3
Numpy Benchmark (Score, More Is Better)

Result: 565.16 (SE +/- 0.45, N = 3)
AI Benchmark Alpha 0.1.2 (Score, More Is Better)

Device Inference Score   3196
Device Training Score    3380
Device AI Score          6576
Llama.cpp b4397 (Tokens Per Second, More Is Better)
1. (CXX) g++ options: -O3

Backend       Model                               Test                      Tokens/s   SE +/-    N
CPU BLAS      Llama-3.1-Tulu-3-8B-Q8_0            Text Generation 128          17.56     0.02    3
CPU BLAS      Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 512        43.87     0.40   15
CPU BLAS      Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 1024       43.85     0.75   12
CPU BLAS      Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 2048       45.57     0.99    7
NVIDIA CUDA   Llama-3.1-Tulu-3-8B-Q8_0            Text Generation 128         147.63     0.09    3
NVIDIA CUDA   Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 512      8352.57     0.80    3
NVIDIA CUDA   Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 1024    13019.73     5.36    3
NVIDIA CUDA   Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 2048    18283.75     0.46    3
CPU BLAS      Mistral-7B-Instruct-v0.3-Q8_0       Text Generation 128          18.58     0.03    3
CPU BLAS      Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 512        43.54     0.69   12
CPU BLAS      Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 1024       42.88     0.38   12
CPU BLAS      Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 2048       44.00     0.84    9
NVIDIA CUDA   Mistral-7B-Instruct-v0.3-Q8_0       Text Generation 128         151.83     0.10    3
NVIDIA CUDA   Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 512      8390.83     0.48    3
NVIDIA CUDA   Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 1024    13052.60     1.13    3
NVIDIA CUDA   Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 2048    18308.61     1.84    3
CPU BLAS      granite-3.0-3b-a800m-instruct-Q8_0  Text Generation 128          63.08     0.35    3
CPU BLAS      granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 512       193.81     6.45   12
CPU BLAS      granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 1024      212.92     2.68   15
CPU BLAS      granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 2048      211.93     2.80    3
NVIDIA CUDA   granite-3.0-3b-a800m-instruct-Q8_0  Text Generation 128          84.98     0.01    3
NVIDIA CUDA   granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 512      3144.42     0.28    3
NVIDIA CUDA   granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 1024     3150.78     0.63    3
NVIDIA CUDA   granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 2048     3121.14     0.10    3
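Prompt-processing and text-generation rates combine into a rough end-to-end latency estimate: time is approximately prompt_tokens / pp_rate + output_tokens / tg_rate. A sketch using the CUDA Llama-3.1-Tulu-3-8B-Q8_0 figures reported above, ignoring model load, launch, and sampling overhead:

```python
def latency_estimate(prompt_tokens, output_tokens, pp_rate, tg_rate):
    """Approximate wall time: prefill at pp_rate, then decode at tg_rate (both tokens/s)."""
    return prompt_tokens / pp_rate + output_tokens / tg_rate

# CUDA backend, Llama-3.1-Tulu-3-8B-Q8_0 (rates from the results above)
t = latency_estimate(prompt_tokens=512, output_tokens=128,
                     pp_rate=8352.57, tg_rate=147.63)
print(f"~{t:.2f} s for a 512-token prompt and 128 generated tokens")
```

The estimate makes the usual asymmetry visible: at these rates the decode phase dominates wall time even for fairly long prompts, which is why text-generation throughput is the more user-visible of the two metrics.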
Llamafile 0.8.16 (Tokens Per Second, More Is Better)

Model                              Test                     Tokens/s   SE +/-   N
Llama-3.2-3B-Instruct.Q6_K         Text Generation 16          41.91     0.21   3
Llama-3.2-3B-Instruct.Q6_K         Text Generation 128         42.16     0.30   3
Llama-3.2-3B-Instruct.Q6_K         Prompt Processing 256     4096        0.00   3
Llama-3.2-3B-Instruct.Q6_K         Prompt Processing 512     8192        0.00   3
Llama-3.2-3B-Instruct.Q6_K         Prompt Processing 1024   16384        0.00   3
Llama-3.2-3B-Instruct.Q6_K         Prompt Processing 2048   32768        0.00   3
TinyLlama-1.1B-Chat-v1.0.BF16      Text Generation 16          57.78     0.61   4
TinyLlama-1.1B-Chat-v1.0.BF16      Text Generation 128         56.82     0.59   3
TinyLlama-1.1B-Chat-v1.0.BF16      Prompt Processing 256     4096        0.00   3
TinyLlama-1.1B-Chat-v1.0.BF16      Prompt Processing 512     8192        0.00   3
TinyLlama-1.1B-Chat-v1.0.BF16      Prompt Processing 1024   16384        0.00   3
TinyLlama-1.1B-Chat-v1.0.BF16      Prompt Processing 2048   32768        0.00   3
mistral-7b-instruct-v0.2.Q5_K_M    Text Generation 16          24.04     0.09   3
mistral-7b-instruct-v0.2.Q5_K_M    Text Generation 128         24.12     0.04   3
mistral-7b-instruct-v0.2.Q5_K_M    Prompt Processing 256     4096        0.00   3
mistral-7b-instruct-v0.2.Q5_K_M    Prompt Processing 512     8192        0.00   3
mistral-7b-instruct-v0.2.Q5_K_M    Prompt Processing 1024   16384        0.00   3
mistral-7b-instruct-v0.2.Q5_K_M    Prompt Processing 2048   32768        0.00   3
wizardcoder-python-34b-v1.0.Q6_K   Text Generation 16           4.92     0.03   3
wizardcoder-python-34b-v1.0.Q6_K   Text Generation 128          5.00     0.02   3
wizardcoder-python-34b-v1.0.Q6_K   Prompt Processing 256     1536        0.00   3
wizardcoder-python-34b-v1.0.Q6_K   Prompt Processing 512     3072        0.00   3
wizardcoder-python-34b-v1.0.Q6_K   Prompt Processing 1024    6144        0.00   3
wizardcoder-python-34b-v1.0.Q6_K   Prompt Processing 2048   12288        0.00   3
spaCy 3.4.1 (tokens/sec, More Is Better)

Model             tokens/sec   SE +/-   N
en_core_web_lg         15338    33.20   3
en_core_web_trf         2751    96.13   3
NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

Test / Class   Total Mop/s     SE +/-     N
BT.C             176096.16     248.09     3
CG.C              50419.73     396.73    15
EP.C               6692.95      96.53    15
EP.D               7273.96      66.24     7
FT.C             102149.25     426.91     3
IS.D               4064.08      15.41     3
LU.C             199460.27     202.80     3
MG.C             119325.63     947.46     3
SP.B             146367.34    1283.80     8
SP.C             108596.64     153.92     3
Intel MPI Benchmarks 2019.3 (Average usec, Fewer Is Better)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Test                Avg usec   SE +/-   N   MIN    MAX
IMB-MPI1 Exchange     108.05     3.64   3   1.13   1949.29
IMB-MPI1 Sendrecv      79.47     0.66   3   0.64   1174.19
LiteRT 2024-10-15 (Microseconds, Fewer Is Better)

Model                             Microseconds   SE +/-     N
DeepLab V3                             8793.87    57.34    15
SqueezeNet                             5245.27    33.72     3
Inception V4                          38062.4    455.18     3
NASNet Mobile                         75527.3    699.04     7
Mobilenet Float                        3345.85    39.32     3
Mobilenet Quant                        3255.76    51.59    15
Inception ResNet V2                   45507.6    312.59    15
Quantized COCO SSD MobileNet v1        5503.91    71.29    13
TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)

Model                 Microseconds    SE +/-     N
SqueezeNet                 5088.14     29.41     3
Inception V4              37181.9     358.95     3
NASNet Mobile             69069.7     562.50     3
Mobilenet Float            3350.72     26.31    15
Mobilenet Quant            4098.48     12.05     3
Inception ResNet V2       68557.9    1796.42    15
PyBench 2018-02-16 (Milliseconds, Fewer Is Better)
Total For Average Test Times: 738 (SE +/- 3.06, N = 3)
oneDNN 3.6 (ms, Fewer Is Better) - Engine: CPU

  Harness                                Result      SE +/-      N     MIN
  IP Shapes 1D                           0.827742    0.005050    3     0.77
  IP Shapes 3D                           0.310889    0.002020    3     0.28
  Convolution Batch Shapes Auto          0.673369    0.001462    3     0.64
  Deconvolution Batch shapes_1d          8.66502     0.03403     3     7.33
  Deconvolution Batch shapes_3d          1.10328     0.00100     3     1.05
  Recurrent Neural Network Training    669.68        1.20        3   653.95
  Recurrent Neural Network Inference   398.49        0.88        3   390.22

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
NCNN 20241226 (ms, Fewer Is Better) - Target: CPU

  Model                  Result    SE +/-     N    MIN / MAX
  mobilenet               23.31     0.42     12    20.72 / 49.06
  mobilenet-v2            11.85     0.07     12    11.19 / 16.92
  mobilenet-v3            14.36     1.23     12    12.51 / 1452
  shufflenet-v2           16.76     0.05     12    16.18 / 40.16
  mnasnet                 11.52     0.07     12    10.69 / 16.83
  efficientnet-b0         16.71     0.14     12    14.94 / 30.14
  blazeface                6.39     0.05     12     5.9 / 7.26
  googlenet               22.65     0.26     12    20.21 / 53.24
  vgg16                   42.12     0.44     12    38.09 / 61.81
  resnet18                12.69     0.15     12    11.43 / 28.06
  alexnet                  6.32     0.07     12     5.67 / 19.42
  resnet50                22.82     0.25     12    21.27 / 32.64
  mobilenetv2-yolov3      23.31     0.42     12    20.72 / 49.06
  yolov4-tiny             31.75     0.47     12    28.18 / 38.62
  squeezenet_ssd          23.98     0.32     12    21.56 / 65.28
  regnety_400m            45.73     0.27     12    -425.63 / 85.35
  vision_transformer      57.21     0.39     12    52 / 568.96
  FastestDet              17.79     0.32     12    16.23 / 24.59

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 (ms, Fewer Is Better) - Target: Vulkan GPU

  Model                  Result    SE +/-     N    MIN / MAX
  mobilenet               23.15     0.17      3    22.82 / 28.93
  mobilenet-v2            11.77     0.40      3    11.14 / 13.15
  mobilenet-v3            13.33     0.23      3    12.65 / 18.55
  shufflenet-v2           16.56     0.09      3    16.28 / 17.55
  mnasnet                 11.69     0.22      3    11.09 / 19.42
  efficientnet-b0         16.99     0.15      3    16.56 / 22.2
  blazeface                6.35     0.05      3     6.16 / 11.26
  googlenet               22.25     0.24      3    21.82 / 28.73
  vgg16                   41.41     1.07      3    39.19 / 48.36
  resnet18                12.46     0.18      3    11.95 / 13.13
  alexnet                  6.58     0.16      3     6.19 / 11.02
  resnet50                22.63     0.47      3    21.57 / 30.39
  mobilenetv2-yolov3      23.15     0.17      3    22.82 / 28.93
  yolov4-tiny             31.30     0.69      3    29.76 / 37.03
  squeezenet_ssd          23.72     0.30      3    23.06 / 29.24
  regnety_400m            46.67     0.91      3    45.2 / 88.32
  vision_transformer      56.65     0.48      3    54.43 / 62.81
  FastestDet              17.29     0.64      3    15.96 / 23.34

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Glibc Benchmarks 2.39 (ns, Fewer Is Better)

  Benchmark        Result      SE +/-     N
  cos              70.82       0.01       3
  exp              15.14       0.00       3
  ffs               5.67032    0.00021    3
  pow              35.56       0.13       3
  sin              63.12       0.00       3
  log2             10.42       0.00       3
  modf              6.61957    0.00034    3
  sinh             22.96       0.02       3
  sqrt              8.24574    0.00536    3
  tanh             26.97       0.00       3
  asinh            22.58       0.00       3
  atanh            28.18       0.00       3
  ffsll             5.67055    0.00011    3
  sincos           38.98       0.01       3
  pthread_once      5.65547    0.00045    3

1. (CC) gcc options: -pie -nostdlib -nostartfiles -lgcc -lgcc_s
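The glibc figures above are nanoseconds per call from glibc's own math microbenchmarks. A crude per-call timing loop of the same shape can be sketched in Python; note that CPython's call overhead dominates here, so these numbers will come out far higher than the compiled glibc results and are only useful for relative comparison between functions:

```python
import math
import time

def ns_per_call(fn, arg, iters=1_000_000):
    """Crude per-call latency: total wall time divided by iteration count."""
    t0 = time.perf_counter_ns()
    acc = 0.0
    for _ in range(iters):
        acc += fn(arg)  # accumulate so the call is not trivially removable
    t1 = time.perf_counter_ns()
    return (t1 - t0) / iters

for name in ("cos", "exp", "sqrt", "tanh"):
    print(f"{name}: {ns_per_call(getattr(math, name), 1.234):.1f} ns/call")
```

A serious harness would also pin the CPU frequency governor and vary the argument across iterations, as the result header's "Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)" note hints.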
Timed Tests (Seconds, Fewer Is Better)

  Test                                                 Result      SE +/-    N
  Timed MrBayes Analysis 3.2.7 - Primate Phylogeny      81.58      0.77      3
  Epoch 4.19.4 - Epoch3D Deck: Cone                    384.19      4.25      3
  Timed GCC Compilation 13.2 - Time To Compile         913.62      0.79      3
  Timed Linux Kernel Compilation 6.8 - defconfig        30.30      0.22     14
  Timed Linux Kernel Compilation 6.8 - allmodconfig    258.64      0.84      3
  Timed LLVM Compilation 16.0 - Ninja                  135.29      1.19      3
  Timed LLVM Compilation 16.0 - Unix Makefiles         198.90      1.10      3
  Cython Benchmark 0.29.21 - N-Queens                   17.98      0.14      3
  R Benchmark                                            0.1446    0.0014    3

MrBayes: (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm -lreadline
Epoch: (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
R Benchmark: R scripting front-end version 4.1.2 (2021-11-01)
Phoronix Test Suite v10.8.5