2 x AMD EPYC 9684X 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 24.10 via the Phoronix Test Suite.
a Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101148Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)Python Notes: Python 3.12.7Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
b c Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1007B BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10, Kernel: 6.11.0-13-generic (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1920x1200
9684X EOY2024 AMD Linux OpenBenchmarking.org Phoronix Test Suite 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads) AMD Titanite_4G (RTI1007B BIOS) AMD Device 14a4 1520GB 3201GB Micron_7450_MTFDKCB3T2TFS ASPEED Broadcom NetXtreme BCM5720 PCIe Ubuntu 24.10 6.11.0-13-generic (x86_64) GNOME Shell 47.0 X Server GCC 14.2.0 ext4 1920x1200 Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Desktop Display Server Compiler File-System Screen Resolution 9684X EOY2024 AMD Linux Benchmarks System Logs - Transparent Huge Pages: madvise - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101148 - OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10) - Python 3.12.7 - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
a b c Result Overview Phoronix Test Suite 100% 108% 117% 125% miniFE NCNN Stockfish Apache Cassandra SVT-AV1 srsRAN Project WarpX Graph500 C-Ray Xcompact3d Incompact3d GROMACS Primesieve OpenVINO GenAI Etcpak Llama.cpp RELION Build2 Epoch Timed FFmpeg Compilation Timed Eigen Compilation Timed Linux Kernel Compilation Intel Open Image Denoise Laghos Blender NAMD x265 Rustls Timed Node.js Compilation Y-Cruncher Palabos 7-Zip Compression ACES DGEMM Timed PHP Compilation OpenSSL WebP Image Encode OpenVINO OSPRay BYTE Unix Benchmark OSPRay Studio LiteRT XNNPACK
9684X EOY2024 AMD Linux etcpak: Multi-Threaded - ETC2 minife: Small namd: ATPase with 327,506 Atoms namd: STMV with 1,066,628 Atoms laghos: Triple Point Problem laghos: Sedov Blast Wave, ube_922_hex.mesh palabos: 100 palabos: 400 palabos: 500 palabos: 1000 epoch: Cone incompact3d: X3D-benchmarking input.i3d incompact3d: input.i3d 129 Cells Per Direction incompact3d: input.i3d 193 Cells Per Direction relion: Basic - CPU warpx: Uniform Plasma warpx: Plasma Acceleration byte: Pipe byte: Dhrystone 2 byte: System Call byte: Whetstone Double webp: Default webp: Quality 100 webp: Quality 100, Lossless webp: Quality 100, Highest Compression webp: Quality 100, Lossless, Highest Compression srsran: PDSCH Processor Benchmark, Throughput Total srsran: PUSCH Processor Benchmark, Throughput Total svt-av1: Preset 3 - Bosphorus 4K svt-av1: Preset 5 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 3 - Bosphorus 1080p svt-av1: Preset 5 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p x265: Bosphorus 4K x265: Bosphorus 1080p x265: Bosphorus 4K x265: Bosphorus 1080p mt-dgemm: Sustained Floating-Point Rate oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only oidn: RTLightmap.hdr.4096x4096 - CPU-Only ospray: particle_volume/ao/real_time ospray: particle_volume/scivis/real_time ospray: particle_volume/pathtracer/real_time ospray: gravity_spheres_volume/dim_512/ao/real_time ospray: gravity_spheres_volume/dim_512/scivis/real_time ospray: gravity_spheres_volume/dim_512/pathtracer/real_time compress-7zip: Compression Rating compress-7zip: Decompression Rating compress-7zip: Compression Rating compress-7zip: Decompression Rating stockfish: Chess Benchmark build-ffmpeg: Time To Compile build-linux-kernel: defconfig build-linux-kernel: allmodconfig build-nodejs: Time To Compile build-php: Time To Compile build2: Time To Compile c-ray: 4K - 16 c-ray: 5K - 16 c-ray: 1080p - 16 primesieve: 1e12 primesieve: 1e13 y-cruncher: 1B y-cruncher: 5B y-cruncher: 500M ospray-studio: 1 - 4K - 1 - Path Tracer - CPU ospray-studio: 2 - 4K - 1 - Path Tracer - CPU ospray-studio: 3 - 4K - 1 - Path Tracer - CPU ospray-studio: 1 - 4K - 16 - Path Tracer - CPU ospray-studio: 1 - 4K - 32 - Path Tracer - CPU ospray-studio: 2 - 4K - 16 - Path Tracer - CPU ospray-studio: 2 - 4K - 32 - Path Tracer - CPU ospray-studio: 3 - 4K - 16 - Path Tracer - CPU ospray-studio: 3 - 4K - 32 - Path Tracer - CPU ospray-studio: 1 - 1080p - 1 - Path Tracer - CPU ospray-studio: 2 - 1080p - 1 - Path Tracer - CPU ospray-studio: 3 - 1080p - 1 - Path Tracer - CPU ospray-studio: 1 - 1080p - 16 - Path Tracer - CPU ospray-studio: 1 - 1080p - 32 - Path Tracer - CPU ospray-studio: 2 - 1080p - 16 - Path Tracer - CPU ospray-studio: 2 - 1080p - 32 - Path Tracer - CPU ospray-studio: 3 - 1080p - 16 - Path Tracer - CPU ospray-studio: 3 - 1080p - 32 - Path Tracer - CPU build-eigen: Time To Compile rustls: handshake - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 rustls: handshake - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 rustls: handshake-resume - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 rustls: handshake-ticket - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 rustls: handshake-resume - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 rustls: handshake-ticket - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 openssl: SHA256 openssl: SHA512 openssl: RSA4096 openssl: RSA4096 openssl: ChaCha20 openssl: AES-128-GCM openssl: AES-256-GCM openssl: ChaCha20-Poly1305 graph500: 26 graph500: 26 graph500: 26 graph500: 26 gromacs: MPI CPU - water_GMX50_bare gromacs: water_GMX50_bare litert: DeepLab V3 litert: SqueezeNet litert: Inception V4 litert: NASNet Mobile litert: Mobilenet Float litert: Mobilenet Quant litert: Inception ResNet V2 litert: Quantized COCO SSD MobileNet v1 ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 ncnn: CPU - yolov4-tiny ncnn: CPU - squeezenet_ssd ncnn: CPU - regnety_400m ncnn: CPU - vision_transformer ncnn: CPU - FastestDet xnnpack: FP32MobileNetV1 xnnpack: FP32MobileNetV2 xnnpack: FP32MobileNetV3Large xnnpack: FP32MobileNetV3Small xnnpack: FP16MobileNetV1 xnnpack: FP16MobileNetV2 xnnpack: FP16MobileNetV3Large xnnpack: FP16MobileNetV3Small xnnpack: QS8MobileNetV2 blender: BMW27 - CPU-Only blender: Junkshop - CPU-Only blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only blender: Pabellon Barcelona - CPU-Only openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU cassandra: Writes llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 openvino-genai: Gemma-7b-int4-ov - CPU openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token a b c 875.752 65310.9 17.85166 5.65889 202.39 404.654108843 734.581 530.001 556.406 659.936 455.5 160.910339 1.29827499 3.97435308 76.042 39.29356258 37.69209741 302008825.4 12634644588.3 247897646.3 2713226.7 17.92 11.18 1.44 3.57 0.56 129795.7 7635.4 11.022 39.596 110.454 154.085 28.759 95.439 316.396 551.636 35.86 100.86 34.9 97.18 3788.093096 3.29 3.25 1.57 46.8907 46.8538 202.362 43.3925 42.9096 50.4084 804522 1155891 784684 1178953 512969802 19.901 28.842 206.651 111.738 41.191 73.151 18.262 32.096 5.041 1.147 12.83 9.057 32.809 4.607 638 642 749 10152 20328 10279 20412 11987 23858 162 164 190 2573 5158 2603 5169 3042 6054 40.544 82349.22 399853.9 2839244.84 2073385.56 1698216.97 1390278.58 242548530680 78840468670 74941.9 2833287.3 983205666540 1787346309360 1529363548300 686377188830 1540820000 1653580000 615929000 879749000 19.994 12.691 52254.8 31218 162456 581416 20933.9 20120 254071 29419.1 32.32 22.98 27.96 28.54 21.21 34.25 15.23 38.09 57.29 23.91 10.05 40.59 32.32 41.1 48.87 130.33 62.7 33.63 24800 33117 61147 59877 20169 60203 61636 48621 48535 9 12.3 22.54 11.74 79.23 26.74 95.79 499.71 1004.23 47.76 7461.75 6.42 179.06 267.41 20737.16 2.3 3190.93 15.03 10334.31 4.62 9365.24 20.3 31408.22 6.09 3548.57 13.51 1004.04 47.76 17881.36 10.61 9438.66 5.07 6669.51 25.72 4858.25 39.29 11355.93 4.21 134020.93 0.56 5096.13 37.62 164808.53 0.36 296920 26.13 75.67 74.92 75.93 30.83 69.99 75.97 76.4 54.03 158.95 158.71 160.28 33.63 46.08 29.74 68.29 17.76 14.64 50.58 34.43 19.77 57.19 27.52 17.49 853.032 67739.3 17.85590 5.65973 203.72 397.50 729.043 529.249 556.414 659.837 461.42 163.125946 1.28448296 4.15029907 74.219 41.4413694 39.11805274 302075584.9 12655563306.7 247922303.2 2714473.2 18.05 11.13 1.44 3.59 0.56 132681.9 8082.6 10.927 38.524 121.817 187.688 28.792 94.715 328.083 480.526 35.78 99.95 34.83 98.77 3796.85965 3.25 3.26 1.55 46.9915 46.9547 199.846 43.3994 42.8705 50.5096 789248 1180405 773934 1192798 477502351 20.029 29.001 207.598 111.629 41.222 73.584 18.502 32.067 5.323 1.215 12.809 9.057 32.889 4.656 640 641 751 10163 20294 10233 20522 12038 24050 163 163 190 2575 5143 2585 5196 3035 6085 40.093 82232.23 403088.18 2963208.68 2060813.4 1715553.24 1365832.45 242399788540 78904969300 75010.2 2820653.3 983433981080 1788600410900 1529285362960 685952939270 1524900000 1641600000 610704000 842062000 19.7 12.131 52731.5 30070.9 189474 615123 16633.9 18899.6 258097 27302.5 37.78 23.98 28.66 34.31 23.79 39.4 16.49 43.26 67.68 26.78 11.55 45.84 37.78 48.37 55.33 145.07 75.85 37.5 43522 33453 63696 53472 21882 62409 57237 38085 46034 9.02 12.44 22.59 12 79.31 26.65 95.78 499.92 1002.63 47.84 7462.48 6.41 179.29 267.41 20684.11 2.3 3188.25 15.04 10339.33 4.63 9360.81 20.29 31395.35 6.09 3541.66 13.54 1005.17 47.71 17864.8 10.61 9405.62 5.09 6771.73 26.04 4855.51 39.36 11377.2 4.21 137459.28 0.57 5092.6 37.5 158277.32 0.37 283197 30.39 72.24 79.85 71.43 31.05 65.87 78.14 75.34 54.25 158.65 159.58 154.18 31.8 48.41 31.44 68.45 17.54 14.61 48 35.01 20.84 55.87 27.83 17.9 856.221 50712.3 17.93413 5.71674 204.16 403.45 734.145 531.729 558.061 661.554 458.47 159.310562 1.41423202 4.05741787 74.58 41.02977149 37.78828715 302154327.2 12636880431.6 248008992.6 2714351.2 17.96 11.14 1.44 3.61 0.56 128169.6 7631.4 11.047 39.035 115.683 184.333 28.782 96.098 321.118 636.984 36.03 98.1 35.01 97.06 3782.565125 3.23 3.25 1.56 47.0772 47.0238 201.155 43.1862 42.7973 50.3549 791157 1203970 776559 1181592 557325172 19.759 28.69 205.872 111.247 41.29 72.487 18.223 32.305 4.865 1.145 12.811 9.052 32.825 4.656 641 645 755 10201 20349 10227 20511 11909 23921 163 164 191 2599 5174 2620 5202 3054 6104 40.064 82326.54 402383.42 2869856.83 2035952.96 1729925.11 1392293.46 242005934470 78929554520 74959.3 2877947.2 981347683260 1788351156160 1531952540600 686536891250 1580110000 1727360000 623794000 880729000 19.871 12.292 91444.5 38557.3 187012 543452 25812.1 31618.4 288314 34124.3 40.6 23.6 32.02 32.46 26.06 38.28 17.7 45.04 71.29 27.99 11.91 47.57 40.6 52.04 58.88 146.13 78.66 42.04 24895 45462 56805 53286 26390 59616 54309 47611 60593 9.14 12.41 22.6 11.92 79.94 26.64 95.91 499.76 996.74 48.12 7455.49 6.42 179.45 267.08 20660.92 2.3 3190.21 15.03 10331.07 4.63 9344.75 20.34 31348.09 6.1 3540.87 13.54 998.12 48.05 17734.15 10.69 9401.19 5.09 6951.71 25.39 4862.29 39.26 11360.69 4.21 136174.8 0.57 5085.52 37.54 157712.09 0.36 267019 30.19 78.73 73.96 75.24 30.94 72.06 76.26 77.09 56.6 167.59 163.84 155.34 32.01 47.62 31.24 69.4 17.56 14.41 49.8 35.17 20.08 56.84 27.62 17.59 OpenBenchmarking.org
Etcpak OpenBenchmarking.org Mpx/s, More Is Better Etcpak 2.0 Benchmark: Multi-Threaded - Configuration: ETC2 a b c 200 400 600 800 1000 875.75 853.03 856.22 1. (CXX) g++ options: -flto -pthread
QuantLib Size: S
a: The test quit with a non-zero exit status. E: Error: INVALID BENCHMARK RUN.
b: The test quit with a non-zero exit status. E: Error: INVALID BENCHMARK RUN.
c: The test quit with a non-zero exit status. E: Error: INVALID BENCHMARK RUN.
Size: XXS
a: The test quit with a non-zero exit status. E: Error: INVALID BENCHMARK RUN.
b: The test quit with a non-zero exit status. E: Error: INVALID BENCHMARK RUN.
c: The test quit with a non-zero exit status. E: Error: INVALID BENCHMARK RUN.
Laghos OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Triple Point Problem a b c 40 80 120 160 200 202.39 203.72 204.16 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh a b c 90 180 270 360 450 404.65 397.50 403.45 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Palabos OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 100 a b c 160 320 480 640 800 734.58 729.04 734.15 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 400 a b c 110 220 330 440 550 530.00 529.25 531.73 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 500 a b c 120 240 360 480 600 556.41 556.41 558.06 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 1000 a b c 140 280 420 560 700 659.94 659.84 661.55 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Grid Size: 4000
a: The test quit with a non-zero exit status.
b: The test quit with a non-zero exit status.
c: The test quit with a non-zero exit status.
Epoch OpenBenchmarking.org Seconds, Fewer Is Better Epoch 4.19.4 Epoch3D Deck: Cone a b c 100 200 300 400 500 455.50 461.42 458.47 1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many as you need scalar transport equations. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: X3D-benchmarking input.i3d a b c 40 80 120 160 200 160.91 163.13 159.31 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction a b c 0.3182 0.6364 0.9546 1.2728 1.591 1.29827499 1.28448296 1.41423202 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction a b c 0.9338 1.8676 2.8014 3.7352 4.669 3.97435308 4.15029907 4.05741787 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
RELION OpenBenchmarking.org Seconds, Fewer Is Better RELION 5.0 Test: Basic - Device: CPU a b c 20 40 60 80 100 76.04 74.22 74.58 1. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi
OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Dhrystone 2 a b c 3000M 6000M 9000M 12000M 15000M 12634644588.3 12655563306.7 12636880431.6 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless, Highest Compression a b c 0.126 0.252 0.378 0.504 0.63 0.56 0.56 0.56 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
srsRAN Project OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PDSCH Processor Benchmark, Throughput Total a b c 30K 60K 90K 120K 150K 129795.7 132681.9 128169.6 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PUSCH Processor Benchmark, Throughput Total a b c 2K 4K 6K 8K 10K 7635.4 8082.6 7631.4 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 4K a b c 3 6 9 12 15 11.02 10.93 11.05 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 4K a b c 9 18 27 36 45 39.60 38.52 39.04 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b c 30 60 90 120 150 110.45 121.82 115.68 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b c 40 80 120 160 200 154.09 187.69 184.33 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 1080p a b c 7 14 21 28 35 28.76 28.79 28.78 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 1080p a b c 20 40 60 80 100 95.44 94.72 96.10 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a b c 70 140 210 280 350 316.40 328.08 321.12 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a b c 140 280 420 560 700 551.64 480.53 636.98 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC Video Input: Bosphorus 4K - Video Preset: Fast
a: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
b: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
c: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
Video Input: Bosphorus 4K - Video Preset: Faster
a: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
b: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
c: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
Video Input: Bosphorus 1080p - Video Preset: Fast
a: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
b: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
c: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
Video Input: Bosphorus 1080p - Video Preset: Faster
a: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
b: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
c: The test quit with a non-zero exit status. E: Parameter Check Error: Number of threads out of range (-1
x265 OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 4K a b c 8 16 24 32 40 35.86 35.78 36.03 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 1080p a b c 20 40 60 80 100 100.86 99.95 98.10 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org Frames Per Second, More Is Better x265 Video Input: Bosphorus 4K a b c 8 16 24 32 40 34.90 34.83 35.01 1. x265 [info]: HEVC encoder version 3.6+1-aa7f602f7
OpenBenchmarking.org Frames Per Second, More Is Better x265 Video Input: Bosphorus 1080p a b c 20 40 60 80 100 97.18 98.77 97.06 1. x265 [info]: HEVC encoder version 3.6+1-aa7f602f7
ACES DGEMM OpenBenchmarking.org GFLOP/s, More Is Better ACES DGEMM 1.0 Sustained Floating-Point Rate a b c 800 1600 2400 3200 4000 3788.09 3796.86 3782.57 1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time a b c 11 22 33 44 55 50.41 50.51 50.35
OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression Test: Compression Rating a b c 200K 400K 600K 800K 1000K 784684 773934 776559 1. 7-Zip 24.08 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-08-11
OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression Test: Decompression Rating a b c 300K 600K 900K 1200K 1500K 1178953 1192798 1181592 1. 7-Zip 24.08 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-08-11
Stockfish OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 17 Chess Benchmark a b c 120M 240M 360M 480M 600M 512969802 477502351 557325172 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
C-Ray OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 4K - Rays Per Pixel: 16 a b c 5 10 15 20 25 18.26 18.50 18.22 1. (CC) gcc options: -lpthread -lm
OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 5K - Rays Per Pixel: 16 a b c 8 16 24 32 40 32.10 32.07 32.31 1. (CC) gcc options: -lpthread -lm
OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 1080p - Rays Per Pixel: 16 a b c 1.1977 2.3954 3.5931 4.7908 5.9885 5.041 5.323 4.865 1. (CC) gcc options: -lpthread -lm
OSPRay Studio Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b c 140 280 420 560 700 638 640 641
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU a b c 600 1200 1800 2400 3000 2573 2575 2599
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b c 1100 2200 3300 4400 5500 5158 5143 5174
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU a b c 600 1200 1800 2400 3000 2603 2585 2620
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b c 1100 2200 3300 4400 5500 5169 5196 5202
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU a b c 700 1400 2100 2800 3500 3042 3035 3054
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b c 1300 2600 3900 5200 6500 6054 6085 6104
Rustls OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b c 20K 40K 60K 80K 100K 82349.22 82232.23 82326.54 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b c 90K 180K 270K 360K 450K 399853.90 403088.18 402383.42 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b c 600K 1200K 1800K 2400K 3000K 2839244.84 2963208.68 2869856.83 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b c 400K 800K 1200K 1600K 2000K 2073385.56 2060813.40 2035952.96 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b c 400K 800K 1200K 1600K 2000K 1698216.97 1715553.24 1729925.11 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b c 300K 600K 900K 1200K 1500K 1390278.58 1365832.45 1392293.46 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA256 a b c 50000M 100000M 150000M 200000M 250000M 242548530680 242399788540 242005934470 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA512 a b c 20000M 40000M 60000M 80000M 100000M 78840468670 78904969300 78929554520 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 a b c 16K 32K 48K 64K 80K 74941.9 75010.2 74959.3 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 a b c 600K 1200K 1800K 2400K 3000K 2833287.3 2820653.3 2877947.2 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20 a b c 200000M 400000M 600000M 800000M 1000000M 983205666540 983433981080 981347683260 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-128-GCM a b c 400000M 800000M 1200000M 1600000M 2000000M 1787346309360 1788600410900 1788351156160 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-256-GCM a b c 300000M 600000M 900000M 1200000M 1500000M 1529363548300 1529285362960 1531952540600 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20-Poly1305 a b c 150000M 300000M 450000M 600000M 750000M 686377188830 685952939270 686536891250 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Graph500 This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org bfs median_TEPS, More Is Better Graph500 3.0 Scale: 26 a b c 300M 600M 900M 1200M 1500M 1540820000 1524900000 1580110000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
OpenBenchmarking.org bfs max_TEPS, More Is Better Graph500 3.0 Scale: 26 a b c 400M 800M 1200M 1600M 2000M 1653580000 1641600000 1727360000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
OpenBenchmarking.org sssp median_TEPS, More Is Better Graph500 3.0 Scale: 26 a b c 130M 260M 390M 520M 650M 615929000 610704000 623794000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
OpenBenchmarking.org sssp max_TEPS, More Is Better Graph500 3.0 Scale: 26 a b c 200M 400M 600M 800M 1000M 879749000 842062000 880729000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Ns Per Day, More Is Better GROMACS 2024 Implementation: MPI CPU - Input: water_GMX50_bare a b c 5 10 15 20 25 19.99 19.70 19.87 1. (CXX) g++ options: -O3 -lm
OpenBenchmarking.org Ns Per Day, More Is Better GROMACS Input: water_GMX50_bare a b c 3 6 9 12 15 12.69 12.13 12.29 1. GROMACS version: 2024.2-Ubuntu_2024.2_1
NCNN OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: mobilenet a b c 9 18 27 36 45 32.32 37.78 40.60 MIN: 32.09 / MAX: 37.05 MIN: 37.49 / MAX: 49.72 MIN: 40.3 / MAX: 42.33 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU-v2-v2 - Model: mobilenet-v2 a b c 6 12 18 24 30 22.98 23.98 23.60 MIN: 22.48 / MAX: 24.91 MIN: 23.73 / MAX: 32.78 MIN: 23.33 / MAX: 31.25 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU-v3-v3 - Model: mobilenet-v3 a b c 7 14 21 28 35 27.96 28.66 32.02 MIN: 25.07 / MAX: 299.42 MIN: 25.96 / MAX: 214.05 MIN: 26.22 / MAX: 360.82 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: shufflenet-v2 a b c 8 16 24 32 40 28.54 34.31 32.46 MIN: 28.22 / MAX: 31 MIN: 29.7 / MAX: 325.64 MIN: 30.3 / MAX: 287 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: mnasnet a b c 6 12 18 24 30 21.21 23.79 26.06 MIN: 20.88 / MAX: 61.55 MIN: 23.02 / MAX: 25.23 MIN: 24 / MAX: 282.22 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: efficientnet-b0 a b c 9 18 27 36 45 34.25 39.40 38.28 MIN: 32.27 / MAX: 339.49 MIN: 36.7 / MAX: 209.1 MIN: 37.89 / MAX: 41.58 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: blazeface a b c 4 8 12 16 20 15.23 16.49 17.70 MIN: 15.12 / MAX: 16.82 MIN: 16.39 / MAX: 17.8 MIN: 17.07 / MAX: 21.55 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: googlenet a b c 10 20 30 40 50 38.09 43.26 45.04 MIN: 37.88 / MAX: 45.5 MIN: 43.04 / MAX: 44.82 MIN: 44.77 / MAX: 52.98 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: vgg16 a b c 16 32 48 64 80 57.29 67.68 71.29 MIN: 56.41 / MAX: 59.7 MIN: 65.28 / MAX: 71.33 MIN: 69.89 / MAX: 75.28 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: resnet18 a b c 7 14 21 28 35 23.91 26.78 27.99 MIN: 23.77 / MAX: 25.2 MIN: 26.55 / MAX: 34.81 MIN: 27.8 / MAX: 29.37 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: alexnet a b c 3 6 9 12 15 10.05 11.55 11.91 MIN: 9.71 / MAX: 11.49 MIN: 11.26 / MAX: 13 MIN: 11.57 / MAX: 13.56 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: resnet50 a b c 11 22 33 44 55 40.59 45.84 47.57 MIN: 40.35 / MAX: 42.4 MIN: 45.14 / MAX: 142.04 MIN: 47.3 / MAX: 49.21 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 a b c 9 18 27 36 45 32.32 37.78 40.60 MIN: 32.09 / MAX: 37.05 MIN: 37.49 / MAX: 49.72 MIN: 40.3 / MAX: 42.33 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: yolov4-tiny a b c 12 24 36 48 60 41.10 48.37 52.04 MIN: 40.13 / MAX: 48.66 MIN: 44.62 / MAX: 227.83 MIN: 47.36 / MAX: 54.47 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: squeezenet_ssd a b c 13 26 39 52 65 48.87 55.33 58.88 MIN: 48.32 / MAX: 57.97 MIN: 54.25 / MAX: 188.52 MIN: 58.23 / MAX: 62.18 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: regnety_400m a b c 30 60 90 120 150 130.33 145.07 146.13 MIN: 129.83 / MAX: 134.5 MIN: 142 / MAX: 364.83 MIN: 144.99 / MAX: 148.12 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: vision_transformer a b c 20 40 60 80 100 62.70 75.85 78.66 MIN: 61.66 / MAX: 71.71 MIN: 71.67 / MAX: 100.04 MIN: 74.66 / MAX: 141.44 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: FastestDet a b c 10 20 30 40 50 33.63 37.50 42.04 MIN: 33.37 / MAX: 41.47 MIN: 36.47 / MAX: 39.01 MIN: 40.1 / MAX: 43.88 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b c 20 40 60 80 100 95.79 95.78 95.91 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b c 110 220 330 440 550 499.71 499.92 499.76 MIN: 449.21 / MAX: 571.54 MIN: 479.24 / MAX: 552.53 MIN: 466.23 / MAX: 522.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b c 200 400 600 800 1000 1004.23 1002.63 996.74 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b c 11 22 33 44 55 47.76 47.84 48.12 MIN: 42.27 / MAX: 95.66 MIN: 40.4 / MAX: 93.82 MIN: 43.74 / MAX: 95.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU a b c 1600 3200 4800 6400 8000 7461.75 7462.48 7455.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU a b c 2 4 6 8 10 6.42 6.41 6.42 MIN: 4.91 / MAX: 29.23 MIN: 4.86 / MAX: 30.37 MIN: 5.39 / MAX: 28.39 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b c 40 80 120 160 200 179.06 179.29 179.45 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b c 60 120 180 240 300 267.41 267.41 267.08 MIN: 216.32 / MAX: 362.62 MIN: 244.77 / MAX: 328.68 MIN: 243.4 / MAX: 286.84 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b c 4K 8K 12K 16K 20K 20737.16 20684.11 20660.92 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b c 0.5175 1.035 1.5525 2.07 2.5875 2.3 2.3 2.3 MIN: 1.81 / MAX: 19.11 MIN: 1.94 / MAX: 25.03 MIN: 1.84 / MAX: 19.42 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU a b c 700 1400 2100 2800 3500 3190.93 3188.25 3190.21 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU a b c 4 8 12 16 20 15.03 15.04 15.03 MIN: 13.37 / MAX: 48.6 MIN: 13.62 / MAX: 54.33 MIN: 13.1 / MAX: 58.58 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b c 2K 4K 6K 8K 10K 10334.31 10339.33 10331.07 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b c 1.0418 2.0836 3.1254 4.1672 5.209 4.62 4.63 4.63 MIN: 3.61 / MAX: 26.86 MIN: 3.64 / MAX: 25.83 MIN: 3.65 / MAX: 27.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b c 2K 4K 6K 8K 10K 9365.24 9360.81 9344.75 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b c 5 10 15 20 25 20.30 20.29 20.34 MIN: 18.42 / MAX: 53.61 MIN: 17.92 / MAX: 98.47 MIN: 17.23 / MAX: 77.56 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b c 7K 14K 21K 28K 35K 31408.22 31395.35 31348.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b c 2 4 6 8 10 6.09 6.09 6.10 MIN: 5.01 / MAX: 32 MIN: 4.9 / MAX: 31.14 MIN: 4.96 / MAX: 28.29 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU a b c 800 1600 2400 3200 4000 3548.57 3541.66 3540.87 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU a b c 3 6 9 12 15 13.51 13.54 13.54 MIN: 11.72 / MAX: 45 MIN: 12.3 / MAX: 45.77 MIN: 11.78 / MAX: 41.47 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU a b c 200 400 600 800 1000 1004.04 1005.17 998.12 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU a b c 11 22 33 44 55 47.76 47.71 48.05 MIN: 36.58 / MAX: 94.91 MIN: 39.39 / MAX: 104.16 MIN: 35.89 / MAX: 151.58 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b c 4K 8K 12K 16K 20K 17881.36 17864.80 17734.15 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b c 3 6 9 12 15 10.61 10.61 10.69 MIN: 8.51 / MAX: 44.19 MIN: 8.75 / MAX: 49.94 MIN: 8.86 / MAX: 55.26 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b c 2K 4K 6K 8K 10K 9438.66 9405.62 9401.19 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b c 1.1453 2.2906 3.4359 4.5812 5.7265 5.07 5.09 5.09 MIN: 4.31 / MAX: 24 MIN: 4.4 / MAX: 27.98 MIN: 4.35 / MAX: 26.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU a b c 1500 3000 4500 6000 7500 6669.51 6771.73 6951.71 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU a b c 6 12 18 24 30 25.72 26.04 25.39 MIN: 10.36 / MAX: 59.49 MIN: 9.89 / MAX: 73.93 MIN: 10.78 / MAX: 87.72 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b c 1000 2000 3000 4000 5000 4858.25 4855.51 4862.29 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b c 9 18 27 36 45 39.29 39.36 39.26 MIN: 32.59 / MAX: 69.46 MIN: 33.81 / MAX: 75.48 MIN: 31.98 / MAX: 70.5 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b c 2K 4K 6K 8K 10K 11355.93 11377.20 11360.69 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b c 0.9473 1.8946 2.8419 3.7892 4.7365 4.21 4.21 4.21 MIN: 3.52 / MAX: 21.19 MIN: 3.83 / MAX: 23.73 MIN: 3.57 / MAX: 20.4 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b c 30K 60K 90K 120K 150K 134020.93 137459.28 136174.80 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b c 0.1283 0.2566 0.3849 0.5132 0.6415 0.56 0.57 0.57 MIN: 0.51 / MAX: 20.56 MIN: 0.51 / MAX: 45.57 MIN: 0.51 / MAX: 18.65 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b c 1100 2200 3300 4400 5500 5096.13 5092.60 5085.52 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b c 9 18 27 36 45 37.62 37.50 37.54 MIN: 31.29 / MAX: 56.35 MIN: 29.82 / MAX: 58.47 MIN: 32.61 / MAX: 58.23 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b c 40K 80K 120K 160K 200K 164808.53 158277.32 157712.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b c 0.0833 0.1666 0.2499 0.3332 0.4165 0.36 0.37 0.36 MIN: 0.33 / MAX: 45.82 MIN: 0.33 / MAX: 45.02 MIN: 0.33 / MAX: 47.54 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Llama.cpp OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 a b c 7 14 21 28 35 26.13 30.39 30.19 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 a b c 20 40 60 80 100 75.67 72.24 78.73 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 a b c 20 40 60 80 100 74.92 79.85 73.96 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 a b c 20 40 60 80 100 75.93 71.43 75.24 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 a b c 7 14 21 28 35 30.83 31.05 30.94 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 a b c 16 32 48 64 80 69.99 65.87 72.06 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 a b c 20 40 60 80 100 75.97 78.14 76.26 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 a b c 20 40 60 80 100 76.40 75.34 77.09 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 a b c 13 26 39 52 65 54.03 54.25 56.60 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 a b c 40 80 120 160 200 158.95 158.65 167.59 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 a b c 40 80 120 160 200 158.71 159.58 163.84 1. (CXX) g++ options: -O3
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 a b c 40 80 120 160 200 160.28 154.18 155.34 1. (CXX) g++ options: -O3
a Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101148Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)Python Notes: Python 3.12.7Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 29 December 2024 18:54 by user phoronix.
b Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101148Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)Python Notes: Python 3.12.7Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 29 December 2024 23:48 by user phoronix.
c Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1007B BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10, Kernel: 6.11.0-13-generic (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101148Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)Python Notes: Python 3.12.7Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 30 December 2024 13:48 by user phoronix.