5950X LLVM Clang 15
AMD Ryzen 9 5950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (4006 BIOS) and AMD Radeon RX 6800 16GB on Ubuntu 22.04 via the Phoronix Test Suite. Benchmarks for a future article by Michael Larabel.
HTML result view exported from: https://openbenchmarking.org/result/2209177-NE-5950XLLVM12&grs&sro .
5950X LLVM Clang 15: System Details (identical for both runs except the compiler)

Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (4006 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32GB
Disk: 500GB Western Digital WDS500G3X0C-00SJG0
Graphics: AMD Radeon RX 6800 16GB (2475/1000MHz)
Audio: AMD Navi 21 HDMI Audio
Monitor: ASUS MG28U
Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 6.0.0-060000rc5daily20220915-generic (x86_64)
Desktop: GNOME Shell 42.4
Display Server: X Server 1.21.1.3 + Wayland
OpenGL: 4.6 Mesa 22.3.0-devel (git-03294e1 2022-09-16 jammy-oibaf-ppa) (LLVM 14.0.6 DRM 3.48)
Vulkan: 1.3.228
Compiler (Clang 14 run): Clang 14.0.6-1~oibaf~j
Compiler (Clang 15 run): Clang 15.0.1-++20220915084339+3637f345d2ab-1~exp1~20220915084350.58
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa201016

Python Details
- Python 3.10.4

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
5950X LLVM Clang 15: Results Overview

Test | Clang 14 | Clang 15
jpegxl: PNG - 7 | 11.60 | 14.82
tnn: CPU - DenseNet | 3831.699 | 3236.767
tnn: CPU - MobileNet v2 | 472.686 | 426.460
liquid-dsp: 16 - 256 - 57 | 1054466667 | 976266667
openssl: SHA256 | 29893817370 | 27681428940
srsran: OFDM_Test | 177200000 | 166466667
coremark: CoreMark Size 666 - Iterations Per Second | 706551.876388 | 665975.932895
webp: Quality 100, Lossless | 1.74 | 1.84
srsran: 4G PHY_DL_Test 100 PRB MIMO 256-QAM (UE Mb/s) | 199.8 | 210.0
encode-flac: WAV To FLAC | 13.076 | 13.724
sqlite-speedtest: Timed Time - Size 1,000 | 45.399 | 43.369
openjpeg: NASA Curiosity Panorama M34 | 51408 | 49157
povray: Trace Time | 22.771 | 21.809
aobench: 2048 x 2048 - Total Time | 29.745 | 28.562
srsran: 4G PHY_DL_Test 100 PRB MIMO 256-QAM (eNb Mb/s) | 557.4 | 579.4
srsran: 4G PHY_DL_Test 100 PRB SISO 256-QAM (UE Mb/s) | 211.4 | 218.3
x265: Bosphorus 4K | 26.53 | 27.37
srsran: 5G PHY_DL_NR Test 52 PRB SISO 64-QAM (UE Mb/s) | 79.9 | 82.4
encode-mp3: WAV To MP3 | 5.877 | 6.060
quantlib | 3744.6 | 3848.9
aom-av1: Speed 8 Realtime - Bosphorus 4K | 53.09 | 54.50
toktx: Zstd Compression 9 | 2.161 | 2.107
svt-vp9: VMAF Optimized - Bosphorus 4K | 50.12 | 51.39
compress-zstd: 19, Long Mode - Decompression Speed | 3957.9 | 3860.8
lczero: BLAS | 761 | 778
svt-vp9: PSNR/SSIM Optimized - Bosphorus 4K | 54.40 | 55.56
tscp: AI Chess Performance | 2239871 | 2194544
ncnn: CPU - resnet50 | 20.11 | 19.71
aom-av1: Speed 9 Realtime - Bosphorus 4K | 69.71 | 71.09
ncnn: CPU - googlenet | 11.07 | 10.86
nettle: sha512 | 737.49 | 751.68
encode-opus: WAV To Opus Encode | 5.433 | 5.537
openfoam: motorBike - Mesh Time | 37.0637 | 37.747
ncnn: CPU - regnety_400m | 11.36 | 11.16
libraw: Post-Processing Benchmark | 59.69 | 60.73
srsran: 4G PHY_DL_Test 100 PRB MIMO 64-QAM (eNb Mb/s) | 533.9 | 543.1
srsran: 4G PHY_DL_Test 100 PRB SISO 64-QAM (eNb Mb/s) | 526.2 | 535.1
svt-hevc: 10 - Bosphorus 4K | 85.46 | 86.80
aom-av1: Speed 10 Realtime - Bosphorus 4K | 69.61 | 70.70
ncnn: CPU - mobilenet | 11.37 | 11.54
srsran: 5G PHY_DL_NR Test 52 PRB SISO 64-QAM (eNb Mb/s) | 178.1 | 180.7
svt-av1: Preset 10 - Bosphorus 4K | 82.263 | 81.116
liquid-dsp: 32 - 256 - 57 | 1115233333 | 1100700000
svt-hevc: 7 - Bosphorus 4K | 64.72 | 63.89
nettle: chacha | 1198.85 | 1214.34
srsran: 4G PHY_DL_Test 100 PRB SISO 256-QAM (eNb Mb/s) | 563.4 | 570.1
tnn: CPU - SqueezeNet v2 | 64.574 | 65.317
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 371.08421 | 375.34359
toktx: UASTC 3 + Zstd Compression 19 | 10.572 | 10.455
jpegxl: JPEG - 7 | 103.02 | 104.16
blosc: blosclz bitshuffle | 14077.1 | 14221.4
xmrig: Monero - 1M | 7012.9 | 7081.8
openfoam: motorBike - Execution Time | 103.198 | 104.203
xmrig: Wownero - 1M | 13589.4 | 13469.2
srsran: 4G PHY_DL_Test 100 PRB SISO 64-QAM (UE Mb/s) | 201.8 | 203.6
graphics-magick: HWB Color Space | 1405 | 1417
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 37.263986 | 37.575448
ncnn: CPU - alexnet | 7.38 | 7.32
vosk | 14.455 | 14.346
tnn: CPU - SqueezeNet v1.1 | 304.932 | 302.640
nettle: aes256 | 7286.24 | 7339.72
primesieve: 1e12 | 10.536 | 10.606
ncnn: CPU - vgg16 | 46.68 | 46.99
nettle: poly1305-aes | 3398.23 | 3419.90
ncnn: CPU - vision_transformer | 98.83 | 99.45
ncnn: CPU - efficientnet-b0 | 4.93 | 4.90
openssl: RSA4096 (verify/s) | 316387.7 | 314488.2
ncnn: CPU - mnasnet | 3.72 | 3.70
draco: Lion | 3909 | 3930
ncnn: CPU - shufflenet-v2 | 3.94 | 3.92
blosc: blosclz shuffle | 19848.8 | 19936.9
svt-av1: Preset 12 - Bosphorus 4K | 104.372 | 103.914
ncnn: CPU - resnet18 | 11.47 | 11.51
openssl: RSA4096 (sign/s) | 4806.6 | 4790.2
compress-zstd: 3 - Compression Speed | 5138.6 | 5121.1
financebench: Repo OpenMP | 26297.793620 | 26383.033984
svt-av1: Preset 8 - Bosphorus 4K | 48.511 | 48.376
ncnn: CPU-v2-v2 - mobilenet-v2 | 3.99 | 4.00
toktx: UASTC 3 | 6.502 | 6.486
toktx: UASTC 4 + Zstd Compression 19 | 158.341 | 158.720
financebench: Bonds OpenMP | 40300.998698 | 40390.201823
compress-zstd: 19 - Compression Speed | 50.2 | 50.3
astcenc: Exhaustive | 1.3239 | 1.3214
astcenc: Medium | 103.3708 | 103.5381
asmfish: 1024 Hash Memory, 26 Depth | 59241934 | 59326632
c-ray: Total Time - 4K, 16 Rays Per Pixel | 26.002 | 25.966
astcenc: Thorough | 12.8069 | 12.7944
toktx: Zstd Compression 19 | 14.506 | 14.514
ncnn: CPU-v3-v3 - mobilenet-v3 | 3.52 | 3.52
compress-zstd: 19, Long Mode - Compression Speed | 36.3 | 36.3
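The overview figures mix "more is better" and "fewer is better" units, so comparing the two compilers means normalizing each result first. The following Python sketch is one way to do that, not something the Phoronix Test Suite export itself provides: the helper names `speedup` and `geometric_mean` are hypothetical, and the four sample rows are copied from the overview figures above, with their direction taken from the per-test results further down.

```python
from math import exp, log

# (test, Clang 14, Clang 15, higher_is_better); a hand-picked sample of rows
# from the overview figures, not the full result set.
RESULTS = [
    ("jpegxl: PNG - 7", 11.60, 14.82, True),
    ("tnn: CPU - DenseNet", 3831.699, 3236.767, False),
    ("openssl: SHA256", 29893817370, 27681428940, True),
    ("coremark: CoreMark Size 666", 706551.876388, 665975.932895, True),
]


def speedup(clang14, clang15, higher_is_better):
    """Clang 15 performance relative to Clang 14; above 1.0 means Clang 15 is faster."""
    if higher_is_better:
        return clang15 / clang14
    return clang14 / clang15


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios across unrelated tests."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


if __name__ == "__main__":
    ratios = []
    for name, c14, c15, higher in RESULTS:
        ratio = speedup(c14, c15, higher)
        ratios.append(ratio)
        print(f"{name}: {(ratio - 1.0) * 100:+.1f}% for Clang 15")
    print(f"geometric mean over sample: {geometric_mean(ratios):.3f}")
```

Because only four rows are included, the geometric mean printed here describes this sample only and says nothing about the full set of 90 results.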
JPEG XL libjxl 0.6.1 - Input: PNG - Encode Speed: 7
MP/s, More Is Better
Clang 14: 11.60 (SE +/- 0.00, N = 3)
Clang 15: 14.82 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie

TNN 0.3 - Target: CPU - Model: DenseNet
ms, Fewer Is Better
Clang 14: 3831.70 (SE +/- 2.95, N = 3; MIN: 3738.1 / MAX: 3946.4)
Clang 15: 3236.77 (SE +/- 3.00, N = 3; MIN: 3140.1 / MAX: 3351.74)
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN 0.3 - Target: CPU - Model: MobileNet v2
ms, Fewer Is Better
Clang 14: 472.69 (SE +/- 0.95, N = 3; MIN: 470.05 / MAX: 476.68)
Clang 15: 426.46 (SE +/- 0.20, N = 3; MIN: 424.42 / MAX: 427.81)
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Liquid-DSP 2021.01.31 - Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
Clang 14: 1054466667 (SE +/- 866666.67, N = 3)
Clang 15: 976266667 (SE +/- 859638.17, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -pthread -lm -lc -lliquid

OpenSSL 3.0 - Algorithm: SHA256
byte/s, More Is Better
Clang 14: 29893817370 (SE +/- 16320548.29, N = 3)
Clang 15: 27681428940 (SE +/- 46992215.82, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -flto -lssl -lcrypto -ldl

srsRAN 22.04.1 - Test: OFDM_Test
Samples / Second, More Is Better
Clang 14: 177200000 (SE +/- 1946792.23, N = 3)
Clang 15: 166466667 (SE +/- 635959.47, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

Coremark 1.0 - CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
Clang 14: 706551.88 (SE +/- 996.03, N = 3)
Clang 15: 665975.93 (SE +/- 996.85, N = 3)
1. (CC) gcc options: -O2 -O3 -march=native -flto -lrt" -lrt

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless
MP/s, More Is Better
Clang 14: 1.74 (SE +/- 0.01, N = 3)
Clang 15: 1.84 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -flto -lm

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB MIMO 256-QAM
UE Mb/s, More Is Better
Clang 14: 199.8 (SE +/- 0.90, N = 3)
Clang 15: 210.0 (SE +/- 0.46, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

FLAC Audio Encoding 1.4 - WAV To FLAC
Seconds, Fewer Is Better
Clang 14: 13.08 (SE +/- 0.07, N = 5)
Clang 15: 13.72 (SE +/- 0.03, N = 5)
1. (CXX) g++ options: -O3 -march=native -flto -fvisibility=hidden -logg -lm

SQLite Speedtest 3.30 - Timed Time - Size 1,000
Seconds, Fewer Is Better
Clang 14: 45.40 (SE +/- 0.08, N = 3)
Clang 15: 43.37 (SE +/- 0.16, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -lz

OpenJPEG 2.4 - Encode: NASA Curiosity Panorama M34
ms, Fewer Is Better
Clang 14: 51408 (SE +/- 521.26, N = 5)
Clang 15: 49157 (SE +/- 295.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic

POV-Ray 3.7.0.7 - Trace Time
Seconds, Fewer Is Better
Clang 14: 22.77 (SE +/- 0.06, N = 3)
Clang 15: 21.81 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -flto -lSDL -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

AOBench - Size: 2048 x 2048 - Total Time
Seconds, Fewer Is Better
Clang 14: 29.75 (SE +/- 0.07, N = 3)
Clang 15: 28.56 (SE +/- 0.35, N = 4)
1. (CC) gcc options: -lm -O3 -march=native -flto

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB MIMO 256-QAM
eNb Mb/s, More Is Better
Clang 14: 557.4 (SE +/- 1.59, N = 3)
Clang 15: 579.4 (SE +/- 3.55, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB SISO 256-QAM
UE Mb/s, More Is Better
Clang 14: 211.4 (SE +/- 0.78, N = 3)
Clang 15: 218.3 (SE +/- 0.15, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

x265 3.4 - Video Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 26.53 (SE +/- 0.21, N = 3)
Clang 15: 27.37 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread -lrt -ldl -lnuma

srsRAN 22.04.1 - Test: 5G PHY_DL_NR Test 52 PRB SISO 64-QAM
UE Mb/s, More Is Better
Clang 14: 79.9 (SE +/- 0.12, N = 3)
Clang 15: 82.4 (SE +/- 0.35, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

LAME MP3 Encoding 3.100 - WAV To MP3
Seconds, Fewer Is Better
Clang 14: 5.877 (SE +/- 0.050, N = 3)
Clang 15: 6.060 (SE +/- 0.047, N = 3)
1. (CC) gcc options: -O3 -pipe -march=native -flto -lncurses -lm

QuantLib 1.21
MFLOPS, More Is Better
Clang 14: 3744.6 (SE +/- 24.48, N = 3)
Clang 15: 3848.9 (SE +/- 20.26, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic

AOM AV1 3.4 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 53.09 (SE +/- 0.26, N = 3)
Clang 15: 54.50 (SE +/- 0.60, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -std=c++11 -U_FORTIFY_SOURCE -lm

KTX-Software toktx 4.0 - Settings: Zstd Compression 9
Seconds, Fewer Is Better
Clang 14: 2.161 (SE +/- 0.001, N = 3)
Clang 15: 2.107 (SE +/- 0.001, N = 3)

SVT-VP9 0.3 - Tuning: VMAF Optimized - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 50.12 (SE +/- 0.50, N = 15)
Clang 15: 51.39 (SE +/- 0.45, N = 12)
1. (CC) gcc options: -O3 -fcommon -march=native -flto -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed
MB/s, More Is Better
Clang 14: 3957.9 (SE +/- 12.33, N = 3)
Clang 15: 3860.8 (SE +/- 4.56, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -pthread -lz -llzma

LeelaChessZero 0.28 - Backend: BLAS
Nodes Per Second, More Is Better
Clang 14: 761 (SE +/- 14.83, N = 6)
Clang 15: 778 (SE +/- 9.69, N = 9)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

SVT-VP9 0.3 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 54.40 (SE +/- 0.05, N = 3)
Clang 15: 55.56 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -flto -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

TSCP 1.81 - AI Chess Performance
Nodes Per Second, More Is Better
Clang 14: 2239871 (SE +/- 22581.52, N = 5)
Clang 15: 2194544 (SE +/- 10751.06, N = 5)
1. (CC) gcc options: -O3 -march=native -flto

NCNN 20220729 - Target: CPU - Model: resnet50
ms, Fewer Is Better
Clang 14: 20.11 (SE +/- 0.04, N = 3; MIN: 19.33 / MAX: 21.9)
Clang 15: 19.71 (SE +/- 0.09, N = 15; MIN: 18.65 / MAX: 28.15)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

AOM AV1 3.4 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 69.71 (SE +/- 0.07, N = 3)
Clang 15: 71.09 (SE +/- 0.90, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -std=c++11 -U_FORTIFY_SOURCE -lm

NCNN 20220729 - Target: CPU - Model: googlenet
ms, Fewer Is Better
Clang 14: 11.07 (SE +/- 0.04, N = 3; MIN: 10.23 / MAX: 12.75)
Clang 15: 10.86 (SE +/- 0.08, N = 15; MIN: 10.16 / MAX: 40.48)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

Nettle 3.8 - Test: sha512
Mbyte/s, More Is Better
Clang 14: 737.49 (SE +/- 0.19, N = 3)
Clang 15: 751.68 (SE +/- 2.18, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -ggdb3 -lnettle -lm -lcrypto

Opus Codec Encoding 1.3.1 - WAV To Opus Encode
Seconds, Fewer Is Better
Clang 14: 5.433 (SE +/- 0.020, N = 5)
Clang 15: 5.537 (SE +/- 0.026, N = 5)
1. (CXX) g++ options: -O3 -march=native -flto -logg -lm

OpenFOAM 10 - Input: motorBike - Mesh Time
Seconds, Fewer Is Better
Clang 14: 37.06
Clang 15: 37.75
-lfoamToVTK -llagrangian -lfileFormats -lfiniteVolume -lmeshTools
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm

NCNN 20220729 - Target: CPU - Model: regnety_400m
ms, Fewer Is Better
Clang 14: 11.36 (SE +/- 0.04, N = 2; MIN: 11.22 / MAX: 11.87)
Clang 15: 11.16 (SE +/- 0.06, N = 14; MIN: 10.58 / MAX: 14.9)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

LibRaw 0.20 - Post-Processing Benchmark
Mpix/sec, More Is Better
Clang 14: 59.69 (SE +/- 0.19, N = 3)
Clang 15: 60.73 (SE +/- 0.18, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -ljpeg -lz -lm

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM
eNb Mb/s, More Is Better
Clang 14: 533.9 (SE +/- 0.45, N = 3)
Clang 15: 543.1 (SE +/- 0.15, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM
eNb Mb/s, More Is Better
Clang 14: 526.2 (SE +/- 0.70, N = 3)
Clang 15: 535.1 (SE +/- 0.75, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 85.46 (SE +/- 0.13, N = 3)
Clang 15: 86.80 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

AOM AV1 3.4 - Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 69.61 (SE +/- 0.73, N = 3)
Clang 15: 70.70 (SE +/- 0.37, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -std=c++11 -U_FORTIFY_SOURCE -lm

NCNN 20220729 - Target: CPU - Model: mobilenet
ms, Fewer Is Better
Clang 14: 11.37 (SE +/- 0.09, N = 3; MIN: 10.82 / MAX: 12.01)
Clang 15: 11.54 (SE +/- 0.07, N = 15; MIN: 10.83 / MAX: 19.94)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

srsRAN 22.04.1 - Test: 5G PHY_DL_NR Test 52 PRB SISO 64-QAM
eNb Mb/s, More Is Better
Clang 14: 178.1 (SE +/- 0.52, N = 3)
Clang 15: 180.7 (SE +/- 0.97, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

SVT-AV1 1.2 - Encoder Mode: Preset 10 - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 82.26 (SE +/- 0.16, N = 3)
Clang 15: 81.12 (SE +/- 0.11, N = 3)

Liquid-DSP 2021.01.31 - Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
Clang 14: 1115233333 (SE +/- 952773.73, N = 3)
Clang 15: 1100700000 (SE +/- 1479864.86, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -pthread -lm -lc -lliquid

SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 64.72 (SE +/- 0.66, N = 5)
Clang 15: 63.89 (SE +/- 0.81, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

Nettle 3.8 - Test: chacha
Mbyte/s, More Is Better
Clang 14: 1198.85 (SE +/- 2.17, N = 3; MIN: 589.65 / MAX: 3386.11)
Clang 15: 1214.34 (SE +/- 1.49, N = 3; MIN: 598.24 / MAX: 3435.95)
1. (CC) gcc options: -O3 -march=native -flto -ggdb3 -lnettle -lm -lcrypto

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB SISO 256-QAM
eNb Mb/s, More Is Better
Clang 14: 563.4 (SE +/- 0.33, N = 3)
Clang 15: 570.1 (SE +/- 0.62, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

TNN 0.3 - Target: CPU - Model: SqueezeNet v2
ms, Fewer Is Better
Clang 14: 64.57 (SE +/- 0.03, N = 3; MIN: 64.4 / MAX: 64.85)
Clang 15: 65.32 (SE +/- 0.10, N = 3; MIN: 64.97 / MAX: 65.75)
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time
Seconds, Fewer Is Better
Clang 14: 371.08
Clang 15: 375.34
-lfoamToVTK -llagrangian -lfileFormats -lfiniteVolume -lmeshTools
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm

KTX-Software toktx 4.0 - Settings: UASTC 3 + Zstd Compression 19
Seconds, Fewer Is Better
Clang 14: 10.57 (SE +/- 0.05, N = 3)
Clang 15: 10.46 (SE +/- 0.02, N = 3)

JPEG XL libjxl 0.6.1 - Input: JPEG - Encode Speed: 7
MP/s, More Is Better
Clang 14: 103.02 (SE +/- 0.09, N = 3)
Clang 15: 104.16 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie

C-Blosc 2.3 - Test: blosclz bitshuffle
MB/s, More Is Better
Clang 14: 14077.1 (SE +/- 37.67, N = 3)
Clang 15: 14221.4 (SE +/- 104.51, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto

Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M
H/s, More Is Better
Clang 14: 7012.9 (SE +/- 60.21, N = 3)
Clang 15: 7081.8 (SE +/- 40.85, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -fexceptions -fno-rtti -maes -Ofast -funroll-loops -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

OpenFOAM 10 - Input: motorBike - Execution Time
Seconds, Fewer Is Better
Clang 14: 103.20
Clang 15: 104.20
-lfoamToVTK -llagrangian -lfileFormats -lfiniteVolume -lmeshTools
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm

Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M
H/s, More Is Better
Clang 14: 13589.4 (SE +/- 32.20, N = 3)
Clang 15: 13469.2 (SE +/- 79.54, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -fexceptions -fno-rtti -maes -Ofast -funroll-loops -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM
UE Mb/s, More Is Better
Clang 14: 201.8 (SE +/- 0.70, N = 3)
Clang 15: 203.6 (SE +/- 5.69, N = 3)
-latomic -ldl
1. (CXX) g++ options: -O3 -march=native -flto -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -lpthread -lm

GraphicsMagick 1.3.38 - Operation: HWB Color Space
Iterations Per Minute, More Is Better
Clang 14: 1405 (SE +/- 2.52, N = 3)
Clang 15: 1417 (SE +/- 4.04, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -flto -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time
Seconds, Fewer Is Better
Clang 14: 37.26
Clang 15: 37.58
-lfoamToVTK -llagrangian -lfileFormats -lfiniteVolume -lmeshTools
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm

NCNN 20220729 - Target: CPU - Model: alexnet
ms, Fewer Is Better
Clang 14: 7.38 (SE +/- 0.04, N = 3; MIN: 7.09 / MAX: 8.55)
Clang 15: 7.32 (SE +/- 0.03, N = 15; MIN: 6.89 / MAX: 16.13)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

VOSK Speech Recognition Toolkit 0.3.21
Seconds, Fewer Is Better
Clang 14: 14.46 (SE +/- 0.16, N = 5)
Clang 15: 14.35 (SE +/- 0.10, N = 13)

TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1
ms, Fewer Is Better
Clang 14: 304.93 (SE +/- 0.21, N = 3; MIN: 304.53 / MAX: 305.79)
Clang 15: 302.64 (SE +/- 0.16, N = 3; MIN: 302.22 / MAX: 303.12)
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Nettle 3.8 - Test: aes256
Mbyte/s, More Is Better
Clang 14: 7286.24 (SE +/- 2.05, N = 3; MIN: 5504.06 / MAX: 10861.66)
Clang 15: 7339.72 (SE +/- 2.86, N = 3; MIN: 5536.69 / MAX: 10939.29)
1. (CC) gcc options: -O3 -march=native -flto -ggdb3 -lnettle -lm -lcrypto

Primesieve 8.0 - Length: 1e12
Seconds, Fewer Is Better
Clang 14: 10.54 (SE +/- 0.02, N = 3)
Clang 15: 10.61 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto

NCNN 20220729 - Target: CPU - Model: vgg16
ms, Fewer Is Better
Clang 14: 46.68 (SE +/- 0.32, N = 3; MIN: 45.34 / MAX: 53.65)
Clang 15: 46.99 (SE +/- 0.27, N = 15; MIN: 44.38 / MAX: 56.74)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

Nettle 3.8 - Test: poly1305-aes
Mbyte/s, More Is Better
Clang 14: 3398.23 (SE +/- 1.13, N = 3)
Clang 15: 3419.90 (SE +/- 2.21, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -ggdb3 -lnettle -lm -lcrypto

NCNN 20220729 - Target: CPU - Model: vision_transformer
ms, Fewer Is Better
Clang 14: 98.83 (SE +/- 0.35, N = 3; MIN: 97.58 / MAX: 104.02)
Clang 15: 99.45 (SE +/- 0.20, N = 15; MIN: 97.82 / MAX: 192.86)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

NCNN 20220729 - Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
Clang 14: 4.93 (SE +/- 0.03, N = 3; MIN: 4.83 / MAX: 5.47)
Clang 15: 4.90 (SE +/- 0.02, N = 15; MIN: 4.71 / MAX: 15.1)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

OpenSSL 3.0 - Algorithm: RSA4096
verify/s, More Is Better
Clang 14: 316387.7 (SE +/- 17.46, N = 3)
Clang 15: 314488.2 (SE +/- 146.06, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -flto -lssl -lcrypto -ldl

NCNN 20220729 - Target: CPU - Model: mnasnet
ms, Fewer Is Better
Clang 14: 3.72 (SE +/- 0.02, N = 3; MIN: 3.63 / MAX: 4.33)
Clang 15: 3.70 (SE +/- 0.02, N = 15; MIN: 3.52 / MAX: 5.97)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

Google Draco 1.5.0 - Model: Lion
ms, Fewer Is Better
Clang 14: 3909 (SE +/- 6.89, N = 3)
Clang 15: 3930 (SE +/- 5.46, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto

NCNN 20220729 - Target: CPU - Model: shufflenet-v2
ms, Fewer Is Better
Clang 14: 3.94 (SE +/- 0.04, N = 3; MIN: 3.8 / MAX: 4.39)
Clang 15: 3.92 (SE +/- 0.02, N = 15; MIN: 3.73 / MAX: 14)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

C-Blosc 2.3 - Test: blosclz shuffle
MB/s, More Is Better
Clang 14: 19848.8 (SE +/- 46.57, N = 3)
Clang 15: 19936.9 (SE +/- 96.37, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto

SVT-AV1 1.2 - Encoder Mode: Preset 12 - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 104.37 (SE +/- 0.11, N = 3)
Clang 15: 103.91 (SE +/- 0.24, N = 3)

NCNN 20220729 - Target: CPU - Model: resnet18
ms, Fewer Is Better
Clang 14: 11.47 (SE +/- 0.03, N = 3; MIN: 10.93 / MAX: 12.71)
Clang 15: 11.51 (SE +/- 0.10, N = 15; MIN: 10.37 / MAX: 14.62)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

OpenSSL 3.0 - Algorithm: RSA4096
sign/s, More Is Better
Clang 14: 4806.6 (SE +/- 2.20, N = 3)
Clang 15: 4790.2 (SE +/- 2.63, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -flto -lssl -lcrypto -ldl

Zstd Compression 1.5.0 - Compression Level: 3 - Compression Speed
MB/s, More Is Better
Clang 14: 5138.6 (SE +/- 6.01, N = 3)
Clang 15: 5121.1 (SE +/- 14.64, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -pthread -lz -llzma

FinanceBench 2016-07-25 - Benchmark: Repo OpenMP
ms, Fewer Is Better
Clang 14: 26297.79 (SE +/- 372.37, N = 3)
Clang 15: 26383.03 (SE +/- 284.62, N = 5)
1. (CXX) g++ options: -O3 -march=native -fopenmp

SVT-AV1 1.2 - Encoder Mode: Preset 8 - Input: Bosphorus 4K
Frames Per Second, More Is Better
Clang 14: 48.51 (SE +/- 0.34, N = 3)
Clang 15: 48.38 (SE +/- 0.45, N = 3)

NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
Clang 14: 3.99 (SE +/- 0.05, N = 3; MIN: 3.8 / MAX: 4.54)
Clang 15: 4.00 (SE +/- 0.03, N = 15; MIN: 3.71 / MAX: 13.25)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

KTX-Software toktx 4.0 - Settings: UASTC 3
Seconds, Fewer Is Better
Clang 14: 6.502 (SE +/- 0.004, N = 3)
Clang 15: 6.486 (SE +/- 0.009, N = 3)

KTX-Software toktx 4.0 - Settings: UASTC 4 + Zstd Compression 19
Seconds, Fewer Is Better
Clang 14: 158.34 (SE +/- 0.20, N = 3)
Clang 15: 158.72 (SE +/- 0.25, N = 3)

FinanceBench 2016-07-25 - Benchmark: Bonds OpenMP
ms, Fewer Is Better
Clang 14: 40301.00 (SE +/- 20.59, N = 3)
Clang 15: 40390.20 (SE +/- 154.34, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed
MB/s, More Is Better
Clang 14: 50.2 (SE +/- 0.06, N = 3)
Clang 15: 50.3 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -pthread -lz -llzma

ASTC Encoder 4.0 - Preset: Exhaustive
MT/s, More Is Better
Clang 14: 1.3239 (SE +/- 0.0011, N = 3)
Clang 15: 1.3214 (SE +/- 0.0029, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder 4.0 - Preset: Medium
MT/s, More Is Better
Clang 14: 103.37 (SE +/- 0.18, N = 3)
Clang 15: 103.54 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth
Nodes/second, More Is Better
Clang 14: 59241934 (SE +/- 746401.00, N = 3)
Clang 15: 59326632 (SE +/- 68853.93, N = 3)

C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
Clang 14: 26.00 (SE +/- 0.04, N = 3)
Clang 15: 25.97 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=native -flto

ASTC Encoder 4.0 - Preset: Thorough
MT/s, More Is Better
Clang 14: 12.81 (SE +/- 0.02, N = 3)
Clang 15: 12.79 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

KTX-Software toktx 4.0 - Settings: Zstd Compression 19
Seconds, Fewer Is Better
Clang 14: 14.51 (SE +/- 0.11, N = 3)
Clang 15: 14.51 (SE +/- 0.03, N = 3)

NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
Clang 14: 3.52 (SE +/- 0.02, N = 3; MIN: 3.44 / MAX: 3.99)
Clang 15: 3.52 (SE +/- 0.02, N = 15; MIN: 3.33 / MAX: 11.49)
1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed
MB/s, More Is Better
Clang 14: 36.3 (SE +/- 0.03, N = 3)
Clang 15: 36.3 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -O3 -march=native -flto -pthread -lz -llzma

Phoronix Test Suite v10.8.5