9575F smoke Benchmarks for a future article. AMD EPYC 9575F 64-Core testing with a Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) and ASPEED on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411283-NE-9575FSMOK49&sor .
9575F smoke Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Desktop Display Server Compiler File-System Screen Resolution a b AMD EPYC 9575F 64-Core @ 3.30GHz (64 Cores / 128 Threads) Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) AMD 1Ah 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF 3201GB Micron_7450_MTFDKCB3T2TFS ASPEED 2 x Broadcom NetXtreme BCM5720 PCIe Ubuntu 24.10 6.12.0-rc7-linux-pm-next-phx (x86_64) GNOME Shell 47.0 X Server GCC 14.2.0 ext4 1024x768 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116 Java Details - OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10) Python Details - Python 3.12.7 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
9575F smoke namd: ATPase with 327,506 Atoms namd: STMV with 1,066,628 Atoms laghos: Triple Point Problem laghos: Sedov Blast Wave, ube_922_hex.mesh palabos: 100 palabos: 400 palabos: 500 byte: Pipe byte: Dhrystone 2 byte: System Call byte: Whetstone Double svt-av1: Preset 3 - Bosphorus 4K svt-av1: Preset 5 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 3 - Bosphorus 1080p svt-av1: Preset 5 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p mt-dgemm: Sustained Floating-Point Rate primesieve: 1e12 primesieve: 1e13 build-eigen: Time To Compile astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive astcenc: Very Thorough litert: DeepLab V3 litert: SqueezeNet litert: Inception V4 litert: NASNet Mobile litert: Mobilenet Float litert: Mobilenet Quant litert: Inception ResNet V2 litert: Quantized COCO SSD MobileNet v1 xnnpack: FP32MobileNetV1 xnnpack: FP32MobileNetV2 xnnpack: FP32MobileNetV3Large xnnpack: FP32MobileNetV3Small xnnpack: FP16MobileNetV1 xnnpack: FP16MobileNetV2 xnnpack: FP16MobileNetV3Large xnnpack: FP16MobileNetV3Small xnnpack: QS8MobileNetV2 blender: BMW27 - CPU-Only blender: Junkshop - CPU-Only blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only blender: Pabellon Barcelona - CPU-Only openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU cassandra: Writes llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 openvino-genai: Gemma-7b-int4-ov - CPU openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token a b 12.58888 3.75360 308.92 579.090715221 812.285 734.242 776.31 379708473.5 6028934961.9 360905676.5 1440313.3 17.033 60.729 202.154 481.968 44.809 147.926 529.47 1299.731 5201.055924 1.973 24.78 28.465 1077.777 514.6675 72.4097 6.2161 10.1228 16007.5 3814.42 24673.3 218197 2429.06 19415 39371.7 8744.91 2417 4725 7242 5413 2470 4692 7099 5446 5041 15.15 20.13 41.71 21.26 148.59 47.43 70.11 455.23 656.29 48.68 653.6 48.88 4634.75 6.86 140.59 227.21 15364.86 2.03 1812.32 17.6 7561.85 4.21 6826.87 9.31 22034.49 2.83 2316.75 13.76 781.09 40.94 13617.58 4.64 6219.36 5.09 6485.97 9.56 3054.87 20.93 9322.35 3.41 137029.68 0.34 3111.82 20.55 166477.75 0.26 464181 51.84 313.79 334.84 319.54 54.23 317.19 335.62 321.82 120.87 1013.03 1055.54 965.44 49.16 30.36 20.34 77.27 15.82 12.94 57.82 31.28 17.3 65.33 23.34 15.31 12.64614 3.79737 295.65 567.84 804.119 732.406 772.497 378638216.8 6060068455.1 360867935.4 1440301.8 17.044 60.767 202.429 478.84 44.722 148.163 534.212 1316.946 5203.826684 1.975 24.83 28.37 1135.1168 514.2389 72.3404 6.2233 10.1255 17473.4 3857.68 24727.4 221184 2424.82 18266.4 38276.5 9116.47 2432 4743 7220 5406 2471 4809 7082 5417 5057 15.01 20.25 41.46 21.27 148.6 47.58 70.02 455.56 651.86 49.01 653.07 48.92 4609.7 6.9 140.42 227.4 15466.24 2.02 1856.61 17.18 7540.16 4.22 6821.09 9.31 21941.81 2.83 2326.58 13.71 781.32 40.93 13596.14 4.63 6221.54 5.09 6492.43 9.55 3053 20.95 9309.66 3.42 137431.92 0.34 3106.45 20.59 164424.12 0.26 491661 51.65 296.41 325.9 328.07 54.01 317.46 331.38 327.81 121.27 998.2 1056.46 962.07 49.61 30.16 20.16 77.25 15.9 12.95 55.78 29.27 17.93 65.91 21.91 15.17 OpenBenchmarking.org
NAMD Input: ATPase with 327,506 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: ATPase with 327,506 Atoms b a 3 6 9 12 15 12.65 12.59
NAMD Input: STMV with 1,066,628 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: STMV with 1,066,628 Atoms b a 0.8544 1.7088 2.5632 3.4176 4.272 3.79737 3.75360
Laghos Test: Triple Point Problem OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Triple Point Problem a b 70 140 210 280 350 308.92 295.65 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Laghos Test: Sedov Blast Wave, ube_922_hex.mesh OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh a b 130 260 390 520 650 579.09 567.84 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Palabos Grid Size: 100 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 100 a b 200 400 600 800 1000 812.29 804.12 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Palabos Grid Size: 400 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 400 a b 160 320 480 640 800 734.24 732.41 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Palabos Grid Size: 500 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 500 a b 200 400 600 800 1000 776.31 772.50 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
BYTE Unix Benchmark Computational Test: Pipe OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Pipe a b 80M 160M 240M 320M 400M 379708473.5 378638216.8 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: Dhrystone 2 OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Dhrystone 2 b a 1300M 2600M 3900M 5200M 6500M 6060068455.1 6028934961.9 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: System Call OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: System Call a b 80M 160M 240M 320M 400M 360905676.5 360867935.4 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: Whetstone Double OpenBenchmarking.org MWIPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Whetstone Double a b 300K 600K 900K 1200K 1500K 1440313.3 1440301.8 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 4K b a 4 8 12 16 20 17.04 17.03 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 4K b a 14 28 42 56 70 60.77 60.73 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 4K b a 40 80 120 160 200 202.43 202.15 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b 100 200 300 400 500 481.97 478.84 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 1080p a b 10 20 30 40 50 44.81 44.72 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 1080p b a 30 60 90 120 150 148.16 147.93 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 1080p b a 120 240 360 480 600 534.21 529.47 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 1080p b a 300 600 900 1200 1500 1316.95 1299.73 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ACES DGEMM Sustained Floating-Point Rate OpenBenchmarking.org GFLOP/s, More Is Better ACES DGEMM 1.0 Sustained Floating-Point Rate b a 1100 2200 3300 4400 5500 5203.83 5201.06 1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e12 a b 0.4444 0.8888 1.3332 1.7776 2.222 1.973 1.975 1. (CXX) g++ options: -O3
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e13 a b 6 12 18 24 30 24.78 24.83 1. (CXX) g++ options: -O3
Timed Eigen Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Eigen Compilation 3.4.0 Time To Compile b a 7 14 21 28 35 28.37 28.47
ASTC Encoder Preset: Fast OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Fast b a 200 400 600 800 1000 1135.12 1077.78 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Medium OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Medium a b 110 220 330 440 550 514.67 514.24 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Thorough a b 16 32 48 64 80 72.41 72.34 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Exhaustive b a 2 4 6 8 10 6.2233 6.2161 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Very Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Very Thorough b a 3 6 9 12 15 10.13 10.12 1. (CXX) g++ options: -O3 -flto -pthread
LiteRT Model: DeepLab V3 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: DeepLab V3 a b 4K 8K 12K 16K 20K 16007.5 17473.4
LiteRT Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: SqueezeNet a b 800 1600 2400 3200 4000 3814.42 3857.68
LiteRT Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception V4 a b 5K 10K 15K 20K 25K 24673.3 24727.4
LiteRT Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: NASNet Mobile a b 50K 100K 150K 200K 250K 218197 221184
LiteRT Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Float b a 500 1000 1500 2000 2500 2424.82 2429.06
LiteRT Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Quant b a 4K 8K 12K 16K 20K 18266.4 19415.0
LiteRT Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception ResNet V2 b a 8K 16K 24K 32K 40K 38276.5 39371.7
LiteRT Model: Quantized COCO SSD MobileNet v1 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 a b 2K 4K 6K 8K 10K 8744.91 9116.47
XNNPACK Model: FP32MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV1 a b 500 1000 1500 2000 2500 2417 2432 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV2 a b 1000 2000 3000 4000 5000 4725 4743 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Large b a 1600 3200 4800 6400 8000 7220 7242 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Small b a 1200 2400 3600 4800 6000 5406 5413 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV1 a b 500 1000 1500 2000 2500 2470 2471 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV2 a b 1000 2000 3000 4000 5000 4692 4809 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Large b a 1500 3000 4500 6000 7500 7082 7099 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Small b a 1200 2400 3600 4800 6000 5417 5446 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: QS8MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: QS8MobileNetV2 a b 1100 2200 3300 4400 5500 5041 5057 1. (CXX) g++ options: -O3 -lrt -lm
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: BMW27 - Compute: CPU-Only b a 4 8 12 16 20 15.01 15.15
Blender Blend File: Junkshop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Junkshop - Compute: CPU-Only a b 5 10 15 20 25 20.13 20.25
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Classroom - Compute: CPU-Only b a 10 20 30 40 50 41.46 41.71
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Fishy Cat - Compute: CPU-Only a b 5 10 15 20 25 21.26 21.27
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Barbershop - Compute: CPU-Only a b 30 60 90 120 150 148.59 148.60
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Pabellon Barcelona - Compute: CPU-Only a b 11 22 33 44 55 47.43 47.58
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b 16 32 48 64 80 70.11 70.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b 100 200 300 400 500 455.23 455.56 MIN: 379.27 / MAX: 470.62 MIN: 358.53 / MAX: 467.53 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 140 280 420 560 700 656.29 651.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 11 22 33 44 55 48.68 49.01 MIN: 35.62 / MAX: 80.7 MIN: 37.75 / MAX: 78.79 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU a b 140 280 420 560 700 653.60 653.07 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU a b 11 22 33 44 55 48.88 48.92 MIN: 21.94 / MAX: 80.52 MIN: 36.36 / MAX: 80.16 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU a b 1000 2000 3000 4000 5000 4634.75 4609.70 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU a b 2 4 6 8 10 6.86 6.90 MIN: 3.31 / MAX: 17.14 MIN: 2.86 / MAX: 15.25 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 30 60 90 120 150 140.59 140.42 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 50 100 150 200 250 227.21 227.40 MIN: 213.43 / MAX: 250.31 MIN: 190.49 / MAX: 248.73 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU b a 3K 6K 9K 12K 15K 15466.24 15364.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU b a 0.4568 0.9136 1.3704 1.8272 2.284 2.02 2.03 MIN: 1.16 / MAX: 5.27 MIN: 1.15 / MAX: 5.23 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU b a 400 800 1200 1600 2000 1856.61 1812.32 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU b a 4 8 12 16 20 17.18 17.60 MIN: 11.6 / MAX: 32.37 MIN: 8.57 / MAX: 30.92 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b 1600 3200 4800 6400 8000 7561.85 7540.16 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b 0.9495 1.899 2.8485 3.798 4.7475 4.21 4.22 MIN: 2.44 / MAX: 16.23 MIN: 2.1 / MAX: 17.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 1500 3000 4500 6000 7500 6826.87 6821.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 3 6 9 12 15 9.31 9.31 MIN: 4.57 / MAX: 24.34 MIN: 4.31 / MAX: 25.14 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b 5K 10K 15K 20K 25K 22034.49 21941.81 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b 0.6368 1.2736 1.9104 2.5472 3.184 2.83 2.83 MIN: 1.66 / MAX: 11.71 MIN: 1.62 / MAX: 9.1 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU b a 500 1000 1500 2000 2500 2326.58 2316.75 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU b a 4 8 12 16 20 13.71 13.76 MIN: 8.65 / MAX: 29.04 MIN: 7.69 / MAX: 27.14 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU b a 200 400 600 800 1000 781.32 781.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU b a 9 18 27 36 45 40.93 40.94 MIN: 22.52 / MAX: 59.28 MIN: 22.77 / MAX: 58.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b 3K 6K 9K 12K 15K 13617.58 13596.14 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU b a 1.044 2.088 3.132 4.176 5.22 4.63 4.64 MIN: 2.41 / MAX: 14.26 MIN: 2.41 / MAX: 12.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU b a 1300 2600 3900 5200 6500 6221.54 6219.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 1.1453 2.2906 3.4359 4.5812 5.7265 5.09 5.09 MIN: 3.2 / MAX: 15.27 MIN: 3.24 / MAX: 12.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU b a 1400 2800 4200 5600 7000 6492.43 6485.97 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU b a 3 6 9 12 15 9.55 9.56 MIN: 6.24 / MAX: 20.75 MIN: 6.01 / MAX: 21.58 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 700 1400 2100 2800 3500 3054.87 3053.00 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 5 10 15 20 25 20.93 20.95 MIN: 15.01 / MAX: 31.5 MIN: 13.73 / MAX: 32.59 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b 2K 4K 6K 8K 10K 9322.35 9309.66 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b 0.7695 1.539 2.3085 3.078 3.8475 3.41 3.42 MIN: 2.12 / MAX: 10.97 MIN: 1.64 / MAX: 10.95 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU b a 30K 60K 90K 120K 150K 137431.92 137029.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 0.0765 0.153 0.2295 0.306 0.3825 0.34 0.34 MIN: 0.16 / MAX: 10.05 MIN: 0.15 / MAX: 10.78 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b 700 1400 2100 2800 3500 3111.82 3106.45 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b 5 10 15 20 25 20.55 20.59 MIN: 13.3 / MAX: 28.62 MIN: 13.03 / MAX: 36.12 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b 40K 80K 120K 160K 200K 166477.75 164424.12 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b 0.0585 0.117 0.1755 0.234 0.2925 0.26 0.26 MIN: 0.11 / MAX: 18.01 MIN: 0.11 / MAX: 19.06 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Apache Cassandra Test: Writes OpenBenchmarking.org Op/s, More Is Better Apache Cassandra 5.0 Test: Writes b a 110K 220K 330K 440K 550K 491661 464181
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 a b 12 24 36 48 60 51.84 51.65 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 a b 70 140 210 280 350 313.79 296.41 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 a b 70 140 210 280 350 334.84 325.90 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 b a 70 140 210 280 350 328.07 319.54 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 a b 12 24 36 48 60 54.23 54.01 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 b a 70 140 210 280 350 317.46 317.19 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 a b 70 140 210 280 350 335.62 331.38 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 b a 70 140 210 280 350 327.81 321.82 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 b a 30 60 90 120 150 121.27 120.87 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 a b 200 400 600 800 1000 1013.03 998.20 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 b a 200 400 600 800 1000 1056.46 1055.54 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 a b 200 400 600 800 1000 965.44 962.07 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU b a 11 22 33 44 55 49.61 49.16
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token b a 7 14 21 28 35 30.16 30.36
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token b a 5 10 15 20 25 20.16 20.34
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU a b 20 40 60 80 100 77.27 77.25
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token a b 4 8 12 16 20 15.82 15.90
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token a b 3 6 9 12 15 12.94 12.95
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU a b 13 26 39 52 65 57.82 55.78
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token b a 7 14 21 28 35 29.27 31.28
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token a b 4 8 12 16 20 17.30 17.93
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU b a 15 30 45 60 75 65.91 65.33
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token b a 6 12 18 24 30 21.91 23.34
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token b a 4 8 12 16 20 15.17 15.31
Phoronix Test Suite v10.8.5