Apple M2 compiler benchmarks for a future article by Michael Larabel.
Clang
Processor: Apple M2 @ 2.42GHz (4 Cores / 8 Threads), Motherboard: Apple MacBook Air (13 h M2 2022), Memory: 8GB, Disk: 251GB APPLE SSD AP0256Z + 2 x 0GB APPLE SSD AP0256Z, Graphics: llvmpipe, Network: Broadcom Device 4433 + Broadcom Device 5f71
OS: Arch rolling, Kernel: 5.19.0-rc7-asahi-2-1-ARCH (aarch64), Desktop: KDE Plasma 5.25.4, Display Server: X Server 1.21.1.4, OpenGL: 4.5 Mesa 22.1.6 (LLVM 14.0.6 128 bits), Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 2560x1600
Environment Notes: CFLAGS=-O3
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
GCC
OS: Arch rolling, Kernel: 5.19.0-rc7-asahi-2-1-ARCH (aarch64), Desktop: KDE Plasma 5.25.4, Display Server: X Server 1.21.1.4, OpenGL: 4.5 Mesa 22.1.6 (LLVM 14.0.6 128 bits), Compiler: GCC 12.1.0 + Clang 14.0.6, File-System: ext4, Screen Resolution: 2560x1600
Environment Notes: CFLAGS=-O3
Compiler Notes: --build=aarch64-unknown-linux-gnu --disable-libssp --disable-libstdcxx-pch --disable-multilib --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-fix-cortex-a53-835769 --enable-fix-cortex-a53-843419 --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=c,c++,fortran,go,lto,objc,obj-c++ --enable-lto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-unknown-linux-gnu --mandir=/usr/share/man --with-arch=armv8-a --with-linker-hash-style=gnu
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
[Figure: Clang vs. GCC Comparison (Phoronix Test Suite). Bar chart of the per-test percentage difference between the Clang and GCC builds relative to the baseline; the listed differences range from 2.1% to 116.6%.]
apple m2 compilers
Test | Clang | GCC
aobench: 2048 x 2048 - Total Time | 29.475 | 25.751
basis: ETC1S | 26.163 | 24.130
basis: UASTC Level 0 | 6.548 | 6.003
basis: UASTC Level 2 | 35.881 | 34.123
basis: UASTC Level 3 | 70.156 | 70.001
botan: KASUMI | 93.794 | 84.297
botan: KASUMI - Decrypt | 91.826 | 84.114
botan: Twofish | 351.494 | 341.749
botan: Twofish - Decrypt | 347.604 | 348.144
botan: Blowfish | 436.062 | 415.066
botan: Blowfish - Decrypt | 436.816 | 416.060
botan: CAST-256 | 136.758 | 136.749
botan: CAST-256 - Decrypt | 136.985 | 137.213
botan: ChaCha20Poly1305 | 578.630 | 558.134
botan: ChaCha20Poly1305 - Decrypt | 570.921 | 547.817
c-ray: Total Time - 4K, 16 Rays Per Pixel | 95.542 | 79.027
coremark: CoreMark Size 666 - Iterations Per Second | 175753.764225 | 199947.689206
cryptopp: Keyed Algorithms | 404.764881 | 591.620305
cryptopp: Unkeyed Algorithms | 360.175963 | 590.715697
cryptopp: Integer + Elliptic Curve Public Key Algorithms | 2161.313491 | 1947.622786
dav1d: Chimera 1080p | 376.77 | 375.87
dav1d: Summer Nature 4K | 105.09 | 115.13
dav1d: Summer Nature 1080p | 527.87 | 534.01
dav1d: Chimera 1080p 10-bit | 283.70 | 280.98
espeak: Text-To-Speech Synthesis | 20.297 | 16.933
encode-flac: WAV To FLAC | 41.519 | 37.251
gcrypt: | 255.913 | 258.393
gnupg: 2.7GB Sample File Encryption | 43.850 | 43.891
draco: Lion | 3433 | 3476
draco: Church Facade | 5044 | 5080
graphics-magick: Swirl | 311 | 322
graphics-magick: Rotate | 1571 | 1586
graphics-magick: Sharpen | 97 | 102
graphics-magick: Enhanced | 155 | 171
graphics-magick: Resizing | 613 | 539
graphics-magick: Noise-Gaussian | 121 | 164
graphics-magick: HWB Color Space | 950 | 1361
himeno: Poisson Pressure Solver | 8507.407421 | 7634.591034
jpegxl: PNG - 5 | 35.58 | 20.78
jpegxl: PNG - 7 | 13.12 | 9.78
jpegxl: JPEG - 5 | 114.37 | 101.26
jpegxl: JPEG - 7 | 114.98 | 102.13
jpegxl: JPEG - 8 | 38.65 | 38.16
encode-mp3: WAV To MP3 | 7.573 | 5.678
lammps: Rhodopsin Protein | 3.440 | 3.315
libgav1: Chimera 1080p | 161.84 | 156.32
libgav1: Summer Nature 4K | 55.66 | 55.13
libgav1: Summer Nature 1080p | 133.16 | 131.35
libgav1: Chimera 1080p 10-bit | 81.34 | 77.30
tjbench: Decompression Throughput | 214.698638 | 222.666380
liquid-dsp: 1 - 256 - 57 | 38463333 | 30318000
liquid-dsp: 2 - 256 - 57 | 76895333 | 60534667
liquid-dsp: 4 - 256 - 57 | 153386667 | 121106667
liquid-dsp: 8 - 256 - 57 | 190544667 | 167353333
luajit: Composite | 1387.75 | 1386.16
luajit: Monte Carlo | 424.27 | 429.71
luajit: Fast Fourier Transform | 560.45 | 560.21
luajit: Sparse Matrix Multiply | 1902.44 | 1890.85
luajit: Dense LU Matrix Factorization | 3157.76 | 3156.17
luajit: Jacobi Successive Over-Relaxation | 893.81 | 893.84
ncnn: CPU - mobilenet | 15.03 | 11.94
ncnn: CPU-v2-v2 - mobilenet-v2 | 3.47 | 2.20
ncnn: CPU-v3-v3 - mobilenet-v3 | 3.31 | 2.06
ncnn: CPU - shufflenet-v2 | 3.13 | 1.92
ncnn: CPU - efficientnet-b0 | 6.55 | 3.82
ncnn: CPU - blazeface | 2.28 | 1.86
ncnn: CPU - googlenet | 11.03 | 12.59
ncnn: CPU - vgg16 | 29.59 | 33.70
ncnn: CPU - resnet18 | 6.16 | 7.84
ncnn: CPU - alexnet | 11.15 | 12.08
ncnn: CPU - resnet50 | 15.69 | 15.36
ncnn: CPU - yolov4-tiny | 20.18 | 14.67
ncnn: CPU - squeezenet_ssd | 12.64 | 14.40
ncnn: CPU - regnety_400m | 10.38 | 5.28
ncnn: CPU - mnasnet | 3.63 | 2.23
ncnn: CPU - mobilenet | 13.07 | 11.99
ncnn: CPU-v2-v2 - mobilenet-v2 | 3.62 | 2.21
ncnn: CPU-v3-v3 - mobilenet-v3 | 3.25 | 2.20
ncnn: CPU - shufflenet-v2 | 3.00 | 1.90
ncnn: CPU - mnasnet | 3.53 | 2.28
ncnn: CPU - efficientnet-b0 | 5.56 | 3.26
ncnn: CPU - blazeface | 2.09 | 1.90
ncnn: CPU - googlenet | 10.83 | 12.72
ncnn: CPU - vgg16 | 34.44 | 32.80
ncnn: CPU - resnet18 | 5.90 | 7.04
ncnn: CPU - alexnet | 11.80 | 12.39
ncnn: CPU - resnet50 | 28.71 | 15.24
ncnn: CPU - yolov4-tiny | 20.24 | 14.51
ncnn: CPU - squeezenet_ssd | 12.09 | 9.28
ncnn: CPU - regnety_400m | 10.17 | 5.19
ncnn: CPU - vision_transformer | 544.38 | 501.17
ncnn: CPU - FastestDet | 4.05 | 1.87
ngspice: C2670 | 76.926 | 100.084
ngspice: C7552 | 56.097 | 66.061
openjpeg: NASA Curiosity Panorama M34 | 48724 | 48829
openssl: SHA256 | 9058229720 | 8534811807
openssl: RSA4096 | 1529.5 | 1520.3
openssl: RSA4096 | 107954.2 | 106464.4
encode-opus: WAV To Opus Encode | 14.665 | 14.298
povray: Trace Time | 88.926 | 91.456
primesieve: 1e12 | 40.366 | 39.095
scimark2: Composite | 2870.78 | 2454.22
scimark2: Monte Carlo | 450.23 | 463.86
scimark2: Fast Fourier Transform | 500.91 | 563.13
scimark2: Sparse Matrix Multiply | 4508.82 | 4132.95
scimark2: Dense LU Matrix Factorization | 6907.38 | 5124.15
scimark2: Jacobi Successive Over-Relaxation | 1986.56 | 1987.03
simdjson: Kostya | 3.03 | 3.05
simdjson: TopTweet | 4.43 | 4.25
simdjson: LargeRand | 1.00 | 1.04
simdjson: PartialTweets | 4.35 | 4.18
simdjson: DistinctUserID | 4.45 | 4.19
sockperf: Throughput | 793185 | 847339
sqlite-speedtest: Timed Time - Size 1,000 | 45.453 | 44.635
mrbayes: Primate Phylogeny Analysis | 192.753 | 240.566
tnn: CPU - DenseNet | 7411.322 | 5229.211
tnn: CPU - MobileNet v2 | 426.557 | 306.614
tnn: CPU - SqueezeNet v2 | 56.613 | 53.284
tnn: CPU - SqueezeNet v1.1 | 330.089 | 321.825
encode-wavpack: WAV To WavPack | 17.978 | 18.420
xmrig: Monero - 1M | 2520.6 | 2329.9
xmrig: Wownero - 1M | 2676.2 | 2541.9
compress-zstd: 3 - Compression Speed | 3525.2 | 3526.6
compress-zstd: 3 - Decompression Speed | 4234.8 | 4821.4
compress-zstd: 8 - Compression Speed | 880.2 | 866.1
compress-zstd: 8 - Decompression Speed | 4385.7 | 4986.0
compress-zstd: 19 - Compression Speed | 25.9 | 26.8
compress-zstd: 19 - Decompression Speed | 3932.4 | 4513.1
compress-zstd: 3, Long Mode - Compression Speed | 272.4 | 265.4
compress-zstd: 3, Long Mode - Decompression Speed | 4647.8 | 5241.1
compress-zstd: 8, Long Mode - Compression Speed | 691.9 | 716.9
compress-zstd: 8, Long Mode - Decompression Speed | 4854.7 | 5478.3
compress-zstd: 19, Long Mode - Compression Speed | 21.0 | 21.9
compress-zstd: 19, Long Mode - Decompression Speed | 4146.6 | 4543.4
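Because some tests in the overview report time (fewer is better) and others report throughput (more is better), comparing the two compilers needs a direction-aware calculation. The following is a minimal sketch of one way to do it; the function name `percent_lead` is hypothetical, and the ratio-based formula is an assumption rather than the documented Phoronix Test Suite method.

```python
def percent_lead(clang, gcc, higher_is_better):
    """Return (winner, lead_percent): how far the better result leads the worse one."""
    if clang == gcc:
        return "tie", 0.0
    clang_wins = (clang > gcc) if higher_is_better else (clang < gcc)
    winner = "Clang" if clang_wins else "GCC"
    # Assumed formula: ratio of the larger value to the smaller one, as a percentage.
    lead = (max(clang, gcc) / min(clang, gcc) - 1.0) * 100.0
    return winner, lead


if __name__ == "__main__":
    # Values copied from the result overview above.
    print(percent_lead(360.175963, 590.715697, higher_is_better=True))  # Crypto++ Unkeyed, MiB/s
    print(percent_lead(95.542, 79.027, higher_is_better=False))         # C-Ray, seconds
```

For these two tests the sketch gives GCC a lead of about 64.0% and 20.9% respectively, which agrees with the 64% and 20.9% figures listed in the comparison chart, so the assumed formula appears at least consistent with it.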
Basis Universal Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings. Learn more via the OpenBenchmarking.org test page.
Basis Universal 1.13 (Seconds, Fewer Is Better)
Settings: ETC1S: Clang 26.16 (SE +/- 0.20, N = 3), GCC 24.13 (SE +/- 0.11, N = 3)
Settings: UASTC Level 0: Clang 6.548 (SE +/- 0.012, N = 3), GCC 6.003 (SE +/- 0.007, N = 3)
Settings: UASTC Level 2: Clang 35.88 (SE +/- 0.24, N = 3), GCC 34.12 (SE +/- 0.14, N = 3)
Settings: UASTC Level 3: Clang 70.16 (SE +/- 0.68, N = 3), GCC 70.00 (SE +/- 0.74, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 (MiB/s, More Is Better)
Test: KASUMI: Clang 93.79 (SE +/- 0.01, N = 3), GCC 84.30 (SE +/- 0.01, N = 3)
Test: KASUMI - Decrypt: Clang 91.83 (SE +/- 0.00, N = 3), GCC 84.11 (SE +/- 0.00, N = 3)
Test: Twofish: Clang 351.49 (SE +/- 0.04, N = 3), GCC 341.75 (SE +/- 0.11, N = 3)
Test: Twofish - Decrypt: Clang 347.60 (SE +/- 0.00, N = 3), GCC 348.14 (SE +/- 0.03, N = 3)
Test: Blowfish: Clang 436.06 (SE +/- 0.13, N = 3), GCC 415.07 (SE +/- 0.06, N = 3)
Test: Blowfish - Decrypt: Clang 436.82 (SE +/- 0.01, N = 3), GCC 416.06 (SE +/- 0.06, N = 3)
Test: CAST-256: Clang 136.76 (SE +/- 0.03, N = 3), GCC 136.75 (SE +/- 0.03, N = 3)
Test: CAST-256 - Decrypt: Clang 136.99 (SE +/- 0.01, N = 3), GCC 137.21 (SE +/- 0.02, N = 3)
Test: ChaCha20Poly1305: Clang 578.63 (SE +/- 0.13, N = 3), GCC 558.13 (SE +/- 0.08, N = 3)
Test: ChaCha20Poly1305 - Decrypt: Clang 570.92 (SE +/- 0.16, N = 3), GCC 547.82 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1 (Seconds, Fewer Is Better)
Total Time - 4K, 16 Rays Per Pixel: Clang 95.54 (SE +/- 1.77, N = 15), GCC 79.03 (SE +/- 0.60, N = 15)
1. (CC) gcc options: -lm -lpthread -O3
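Each result is reported with a standard error (SE) and a run count (N), which gives a rough way to judge whether a gap between the two compilers is larger than run-to-run noise. The sketch below expresses the C-Ray gap in units of the combined standard error. It assumes the SE values are standard errors of the mean and that the two sets of runs are independent; the function name `gap_in_se_units` is hypothetical, and this is a heuristic rather than a formal significance test.

```python
import math


def gap_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    # For independent measurements the standard errors add in quadrature.
    combined_se = math.hypot(se_a, se_b)
    return abs(mean_a - mean_b) / combined_se


if __name__ == "__main__":
    # C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (seconds), values from the result above.
    print(gap_in_se_units(95.54, 1.77, 79.03, 0.60))
```

A value well above 2 or 3 suggests the difference is unlikely to be noise alone; values near or below 1, as would be the case for results such as Botan CAST-256, suggest the two builds are effectively tied.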
Crypto++ 8.2 (MiB/second, More Is Better)
Test: Unkeyed Algorithms: Clang 360.18 (SE +/- 0.02, N = 3), GCC 590.72 (SE +/- 0.08, N = 3)
Test: Integer + Elliptic Curve Public Key Algorithms: Clang 2161.31 (SE +/- 0.40, N = 3), GCC 1947.62 (SE +/- 0.86, N = 3)
1. (CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe
dav1d 1.0 (FPS, More Is Better)
Video Input: Summer Nature 4K: Clang 105.09 (SE +/- 2.69, N = 15), GCC 115.13 (SE +/- 1.12, N = 3)
Video Input: Summer Nature 1080p: Clang 527.87 (SE +/- 2.72, N = 3), GCC 534.01 (SE +/- 0.68, N = 3)
Video Input: Chimera 1080p 10-bit: Clang 283.70 (SE +/- 4.07, N = 3), GCC 280.98 (SE +/- 2.29, N = 9)
1. (CC) gcc options: -O3 -pthread -lm
Gcrypt Library Libgcrypt is a general purpose cryptographic library developed as part of the GnuPG project. This test uses libgcrypt's integrated benchmark and measures the time to run the benchmark command with a cipher/mac/hash repetition count of 50, as a simple, high-level look at the overall crypto performance of the system under test. Learn more via the OpenBenchmarking.org test page.
Gcrypt Library 1.9 (Seconds, Fewer Is Better)
Clang 255.91 (SE +/- 0.53, N = 3), GCC 258.39 (SE +/- 0.56, N = 3)
1. (CC) gcc options: -O3 -fvisibility=hidden -lgpg-error
Google Draco Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
Google Draco 1.5.0 (ms, Fewer Is Better)
Model: Lion: Clang 3433 (SE +/- 2.67, N = 3), GCC 3476 (SE +/- 8.25, N = 3)
1. (CXX) g++ options: -O3
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Operation: Swirl: Clang 311 (SE +/- 6.11, N = 15), GCC 322 (SE +/- 4.72, N = 15)
Operation: Rotate: Clang 1571 (SE +/- 11.02, N = 3), GCC 1586 (SE +/- 2.60, N = 3)
Operation: Sharpen: Clang 97 (SE +/- 1.14, N = 15), GCC 102 (SE +/- 1.24, N = 15)
Operation: Enhanced: Clang 155 (SE +/- 1.66, N = 15), GCC 171 (SE +/- 1.86, N = 3)
Operation: Resizing: Clang 613 (SE +/- 6.24, N = 3), GCC 539 (SE +/- 4.56, N = 15)
Operation: Noise-Gaussian: Clang 121 (SE +/- 1.33, N = 15), GCC 164 (SE +/- 2.09, N = 15)
Operation: HWB Color Space: Clang 950 (SE +/- 12.49, N = 15), GCC 1361 (SE +/- 18.02, N = 3)
1. (CC) gcc options: -fopenmp -O3 -lwebp -lwebpmux -llcms2 -ltiff -lfreetype -ljasper -ljpeg -lwmflite -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread
JPEG XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
Input: PNG - Encode Speed: 5: Clang 35.58 (SE +/- 0.36, N = 15), GCC 20.78 (SE +/- 0.20, N = 15)
Input: PNG - Encode Speed: 7: Clang 13.12 (SE +/- 0.03, N = 3), GCC 9.78 (SE +/- 0.02, N = 3)
Input: JPEG - Encode Speed: 5: Clang 114.37 (SE +/- 0.68, N = 3), GCC 101.26 (SE +/- 0.28, N = 3)
Input: JPEG - Encode Speed: 7: Clang 114.98 (SE +/- 0.49, N = 3), GCC 102.13 (SE +/- 0.55, N = 3)
Input: JPEG - Encode Speed: 8: Clang 38.65 (SE +/- 0.24, N = 3), GCC 38.16 (SE +/- 0.19, N = 3)
-Xclang -mrelax-all
1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie
libgav1 0.17 (FPS, More Is Better)
Video Input: Summer Nature 4K: Clang 55.66 (SE +/- 0.16, N = 3), GCC 55.13 (SE +/- 0.21, N = 3)
Video Input: Summer Nature 1080p: Clang 133.16 (SE +/- 1.30, N = 3), GCC 131.35 (SE +/- 0.29, N = 3)
Video Input: Chimera 1080p 10-bit: Clang 81.34 (SE +/- 0.49, N = 3), GCC 77.30 (SE +/- 1.03, N = 3)
1. (CXX) g++ options: -O3 -lrt
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 (samples/s, More Is Better)
Threads: 1 - Buffer Length: 256 - Filter Length: 57: Clang 38463333 (SE +/- 4977.73, N = 3), GCC 30318000 (SE +/- 3511.88, N = 3)
Threads: 2 - Buffer Length: 256 - Filter Length: 57: Clang 76895333 (SE +/- 2403.70, N = 3), GCC 60534667 (SE +/- 4910.31, N = 3)
Threads: 4 - Buffer Length: 256 - Filter Length: 57: Clang 153386667 (SE +/- 133832.40, N = 3), GCC 121106667 (SE +/- 8819.17, N = 3)
Threads: 8 - Buffer Length: 256 - Filter Length: 57: Clang 190544667 (SE +/- 1293649.64, N = 15), GCC 167353333 (SE +/- 521674.65, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
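The Liquid-DSP results at 1, 2, 4 and 8 threads make it possible to estimate how well throughput scales with thread count. The sketch below computes parallel efficiency, defined here as throughput at n threads divided by n times the single-thread throughput. The function name `scaling_efficiency` and the dictionary layout are illustrative choices, not part of the test suite.

```python
def scaling_efficiency(throughput_by_threads):
    """Map thread count -> throughput / (threads * single-thread throughput)."""
    base = throughput_by_threads[1]
    return {n: t / (n * base) for n, t in sorted(throughput_by_threads.items())}


# Liquid-DSP 2021.01.31 samples/s, copied from the results above.
CLANG = {1: 38463333, 2: 76895333, 4: 153386667, 8: 190544667}
GCC = {1: 30318000, 2: 60534667, 4: 121106667, 8: 167353333}

if __name__ == "__main__":
    for name, data in (("Clang", CLANG), ("GCC", GCC)):
        for threads, eff in scaling_efficiency(data).items():
            print(f"{name} {threads} threads: {eff:.3f}")
```

Both builds scale almost linearly up to 4 threads and fall off at 8 (roughly 0.62 for Clang and 0.69 for GCC). One plausible reason is that the M2 pairs four performance cores with four efficiency cores, so the last four threads are not equivalent to the first four; the data here cannot confirm that on its own.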
LuaJIT 2.1-git (Mflops, More Is Better)
Test: Monte Carlo: Clang 424.27 (SE +/- 0.17, N = 3), GCC 429.71 (SE +/- 2.66, N = 3)
Test: Fast Fourier Transform: Clang 560.45 (SE +/- 0.91, N = 3), GCC 560.21 (SE +/- 0.16, N = 3)
Test: Sparse Matrix Multiply: Clang 1902.44 (SE +/- 1.66, N = 3), GCC 1890.85 (SE +/- 13.70, N = 3)
Test: Dense LU Matrix Factorization: Clang 3157.76 (SE +/- 0.85, N = 3), GCC 3156.17 (SE +/- 7.46, N = 3)
Test: Jacobi Successive Over-Relaxation: Clang 893.81 (SE +/- 0.04, N = 3), GCC 893.84 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
NCNN NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.
NCNN 20210720 (ms, Fewer Is Better)
Target: CPU - Model: mobilenet: Clang 15.03 (SE +/- 1.12, N = 13, MIN: 10.47 / MAX: 34.45), GCC 11.94 (SE +/- 0.15, N = 4, MIN: 8.2 / MAX: 27.45)
Target: CPU-v2-v2 - Model: mobilenet-v2: Clang 3.47 (SE +/- 0.06, N = 13, MIN: 3.08 / MAX: 5.31), GCC 2.20 (SE +/- 0.01, N = 4, MIN: 2.17 / MAX: 2.4)
Target: CPU-v3-v3 - Model: mobilenet-v3: Clang 3.31 (SE +/- 0.08, N = 13, MIN: 2.56 / MAX: 7.12), GCC 2.06 (SE +/- 0.01, N = 4, MIN: 2.03 / MAX: 2.17)
Target: CPU - Model: shufflenet-v2: Clang 3.13 (SE +/- 0.07, N = 13, MIN: 2.48 / MAX: 4.66), GCC 1.92 (SE +/- 0.00, N = 4, MIN: 1.9 / MAX: 2.28)
Target: CPU - Model: efficientnet-b0: Clang 6.55 (SE +/- 0.15, N = 13, MIN: 5.13 / MAX: 7.99), GCC 3.82 (SE +/- 0.20, N = 4, MIN: 3.55 / MAX: 25.76)
Target: CPU - Model: blazeface: Clang 2.28 (SE +/- 0.12, N = 13, MIN: 1.38 / MAX: 5.9), GCC 1.86 (SE +/- 0.04, N = 4, MIN: 1.02 / MAX: 8.5)
Target: CPU - Model: googlenet: Clang 11.03 (SE +/- 0.26, N = 13, MIN: 9.1 / MAX: 21.81), GCC 12.59 (SE +/- 0.06, N = 4, MIN: 8.3 / MAX: 21.52)
Target: CPU - Model: vgg16: Clang 29.59 (SE +/- 0.34, N = 13, MIN: 27.88 / MAX: 48.18), GCC 33.70 (SE +/- 0.10, N = 4, MIN: 28.28 / MAX: 51.28)
Target: CPU - Model: resnet18: Clang 6.16 (SE +/- 0.07, N = 13, MIN: 5.6 / MAX: 7.49), GCC 7.84 (SE +/- 0.12, N = 4, MIN: 5.48 / MAX: 21.03)
Target: CPU - Model: alexnet: Clang 11.15 (SE +/- 0.31, N = 13, MIN: 10.07 / MAX: 22.61), GCC 12.08 (SE +/- 0.11, N = 4, MIN: 9.11 / MAX: 22.77)
Target: CPU - Model: resnet50: Clang 15.69 (SE +/- 0.48, N = 13, MIN: 13.48 / MAX: 29.95), GCC 15.36 (SE +/- 0.12, N = 4, MIN: 13.43 / MAX: 24.89)
Target: CPU - Model: yolov4-tiny: Clang 20.18 (SE +/- 0.57, N = 13, MIN: 15.44 / MAX: 33.23), GCC 14.67 (SE +/- 0.31, N = 4, MIN: 12.77 / MAX: 26.02)
Target: CPU - Model: squeezenet_ssd: Clang 12.64 (SE +/- 0.17, N = 13, MIN: 9.67 / MAX: 27.34), GCC 14.40 (SE +/- 0.06, N = 4, MIN: 9.79 / MAX: 29.22)
Target: CPU - Model: regnety_400m: Clang 10.38 (SE +/- 0.26, N = 13, MIN: 8.02 / MAX: 15.57), GCC 5.28 (SE +/- 0.04, N = 4, MIN: 5.16 / MAX: 5.43)
Target: CPU - Model: mnasnet: Clang 3.63 (SE +/- 0.08, N = 13, MIN: 2.94 / MAX: 4.76), GCC 2.23 (MIN: 2.22 / MAX: 2.33)
Clang: -lomp; GCC: -lgomp
1. (CXX) g++ options: -O3 -rdynamic -lpthread
NCNN 20220729 - Target: CPU - Model: mobilenet (ms, fewer is better): Clang 13.07 (SE +/- 0.34, N = 12, MIN: 9.68 / MAX: 25.4, -lomp); GCC 11.99 (SE +/- 0.15, N = 3, MIN: 8.33 / MAX: 23.04, -lgomp)
NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, fewer is better): Clang 3.62 (SE +/- 0.27, N = 12, MIN: 2.9 / MAX: 8.87, -lomp); GCC 2.21 (SE +/- 0.08, N = 3, MIN: 2.09 / MAX: 5.32, -lgomp)
NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better): Clang 3.25 (SE +/- 0.11, N = 12, MIN: 2.49 / MAX: 9.1, -lomp); GCC 2.20 (SE +/- 0.03, N = 3, MIN: 2.08 / MAX: 5.17, -lgomp)
NCNN 20220729 - Target: CPU - Model: shufflenet-v2 (ms, fewer is better): Clang 3.00 (SE +/- 0.07, N = 12, MIN: 2.11 / MAX: 12.45, -lomp); GCC 1.90 (SE +/- 0.05, N = 3, MIN: 1.82 / MAX: 5.07, -lgomp)
NCNN 20220729 - Target: CPU - Model: mnasnet (ms, fewer is better): Clang 3.53 (SE +/- 0.07, N = 12, MIN: 2.72 / MAX: 7.36, -lomp); GCC 2.28 (SE +/- 0.09, N = 3, MIN: 2.11 / MAX: 5.51, -lgomp)
NCNN 20220729 - Target: CPU - Model: efficientnet-b0 (ms, fewer is better): Clang 5.56 (SE +/- 0.11, N = 12, MIN: 4.41 / MAX: 15.78, -lomp); GCC 3.26 (SE +/- 0.02, N = 3, MIN: 3.21 / MAX: 12.85, -lgomp)
NCNN 20220729 - Target: CPU - Model: blazeface (ms, fewer is better): Clang 2.09 (SE +/- 0.06, N = 12, MIN: 1.37 / MAX: 3.21, -lomp); GCC 1.90 (SE +/- 0.05, N = 3, MIN: 0.98 / MAX: 7.72, -lgomp)
NCNN 20220729 - Target: CPU - Model: googlenet (ms, fewer is better): Clang 10.83 (SE +/- 0.19, N = 12, MIN: 8.82 / MAX: 22.43, -lomp); GCC 12.72 (SE +/- 0.07, N = 3, MIN: 8.11 / MAX: 27.99, -lgomp)
NCNN 20220729 - Target: CPU - Model: vgg16 (ms, fewer is better): Clang 34.44 (SE +/- 1.89, N = 12, MIN: 27.08 / MAX: 74.43, -lomp); GCC 32.80 (SE +/- 0.02, N = 3, MIN: 27.43 / MAX: 42.65, -lgomp)
NCNN 20220729 - Target: CPU - Model: resnet18 (ms, fewer is better): Clang 5.90 (SE +/- 0.05, N = 12, MIN: 5.32 / MAX: 7.13, -lomp); GCC 7.04 (SE +/- 0.18, N = 3, MIN: 5.13 / MAX: 19.31, -lgomp)
NCNN 20220729 - Target: CPU - Model: alexnet (ms, fewer is better): Clang 11.80 (SE +/- 0.48, N = 12, MIN: 9.79 / MAX: 24.49, -lomp); GCC 12.39 (SE +/- 0.12, N = 3, MIN: 8.84 / MAX: 22.79, -lgomp)
NCNN 20220729 - Target: CPU - Model: resnet50 (ms, fewer is better): Clang 28.71 (SE +/- 3.28, N = 12, MIN: 13.32 / MAX: 216.5, -lomp); GCC 15.24 (SE +/- 0.03, N = 3, MIN: 13.38 / MAX: 25.59, -lgomp)
NCNN 20220729 - Target: CPU - Model: yolov4-tiny (ms, fewer is better): Clang 20.24 (SE +/- 1.95, N = 12, MIN: 13.87 / MAX: 70.08, -lomp); GCC 14.51 (SE +/- 0.07, N = 3, MIN: 12.49 / MAX: 23.48, -lgomp)
NCNN 20220729 - Target: CPU - Model: squeezenet_ssd (ms, fewer is better): Clang 12.09 (SE +/- 1.64, N = 12, MIN: 7.83 / MAX: 31.07, -lomp); GCC 9.28 (SE +/- 0.10, N = 3, MIN: 7.07 / MAX: 20.52, -lgomp)
NCNN 20220729 - Target: CPU - Model: regnety_400m (ms, fewer is better): Clang 10.17 (SE +/- 0.41, N = 12, MIN: 7.32 / MAX: 18.84, -lomp); GCC 5.19 (SE +/- 0.03, N = 3, MIN: 5.13 / MAX: 9.15, -lgomp)
NCNN 20220729 - Target: CPU - Model: vision_transformer (ms, fewer is better): Clang 544.38 (SE +/- 10.55, N = 12, MIN: 375.2 / MAX: 1425.41, -lomp); GCC 501.17 (SE +/- 0.71, N = 3, MIN: 475.83 / MAX: 544.85, -lgomp)
NCNN 20220729 - Target: CPU - Model: FastestDet (ms, fewer is better): Clang 4.05 (SE +/- 0.40, N = 11, MIN: 2.43 / MAX: 8.08, -lomp); GCC 1.87 (SE +/- 0.11, N = 3, MIN: 1.69 / MAX: 12.73, -lgomp)
Reported build options for the NCNN 20220729 results: (CXX) g++ options: -O3 -rdynamic -lpthread
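Several of the Clang NCNN runs have a large standard error and a very high MAX (for example resnet50 with MAX: 216.5), so a raw difference in means can overstate the gap. As a rough sketch, the difference can be expressed in units of the combined standard error. This assumes the Clang and GCC runs are independent and takes the reported SE values at face value. The helper name `gap_in_standard_errors` is hypothetical and not part of the Phoronix Test Suite.

```python
import math

# (model, clang_mean_ms, clang_se, gcc_mean_ms, gcc_se)
# Values copied from the NCNN 20220729 results in this file.
RESULTS = [
    ("resnet50", 28.71, 3.28, 15.24, 0.03),
    ("vgg16", 34.44, 1.89, 32.80, 0.02),
    ("regnety_400m", 10.17, 0.41, 5.19, 0.03),
    ("googlenet", 10.83, 0.19, 12.72, 0.07),
]


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return (mean_a - mean_b) divided by the combined standard error.

    Assumes the two sets of runs are independent, so the variances add.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


for model, clang, clang_se, gcc, gcc_se in RESULTS:
    gap = gap_in_standard_errors(clang, clang_se, gcc, gcc_se)
    # Positive gap: Clang took longer (GCC faster); negative: Clang faster.
    print(f"{model}: Clang - GCC = {clang - gcc:+.2f} ms, {gap:+.1f} combined SEs")
```

A gap of less than about two combined standard errors, as with vgg16 here, is a common rule of thumb for treating a difference as within run-to-run noise. That threshold is a convention, not something this result file states.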
Ngspice
Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile uses the ISCAS 85 benchmark circuits.
Ngspice 34 - Circuit: C2670 (Seconds, fewer is better): Clang 76.93 (SE +/- 0.28, N = 3); GCC 100.08 (SE +/- 3.54, N = 15)
Ngspice 34 - Circuit: C7552 (Seconds, fewer is better): Clang 56.10 (SE +/- 0.24, N = 3); GCC 66.06 (SE +/- 1.02, N = 15)
Reported build options: (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE
OpenJPEG
OpenJPEG is an open-source JPEG 2000 codec written in the C programming language. The default input for this test profile is the NASA/JPL-Caltech/MSSS Curiosity panorama, a 717MB TIFF image file that is converted to JPEG 2000 format.
OpenJPEG 2.4 - Encode: NASA Curiosity Panorama M34 (ms, fewer is better): Clang 48724 (SE +/- 90.00, N = 3); GCC 48829 (SE +/- 145.89, N = 3)
Reported build options: (CXX) g++ options: -rdynamic
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities.
OpenSSL 3.0 - Algorithm: SHA256 (byte/s, more is better): Clang 9058229720 (SE +/- 10759857.48, N = 3); GCC 8534811807 (SE +/- 13180414.49, N = 3)
OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, more is better): Clang 1529.5 (SE +/- 10.91, N = 3); GCC 1520.3 (SE +/- 15.18, N = 3)
OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, more is better): Clang 107954.2 (SE +/- 712.36, N = 3); GCC 106464.4 (SE +/- 1045.44, N = 3)
Additional flag reported: -Qunused-arguments
Reported build options: (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
Opus Codec Encoding
Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus.
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, fewer is better): Clang 14.67 (SE +/- 0.01, N = 5); GCC 14.30 (SE +/- 0.02, N = 5)
Additional flag reported: -fvisibility=hidden
Reported build options: (CXX) g++ options: -logg -lm
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing.
POV-Ray 3.7.0.7 - Trace Time (Seconds, fewer is better): Clang 88.93 (SE +/- 0.91, N = 4); GCC 91.46 (SE +/- 1.04, N = 4)
Additional flag reported: -R/usr/lib
Reported build options: (CXX) g++ options: -pipe -O3 -ffast-math -lSDL -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
SciMark
This test runs the ANSI C version of SciMark 2.0, a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. The test is made up of Fast Fourier Transform, Jacobi Successive Over-Relaxation, Monte Carlo, Sparse Matrix Multiply, and Dense LU Matrix Factorization benchmarks.
SciMark 2.0 - Computational Test: Composite (Mflops, more is better): Clang 2870.78 (SE +/- 3.27, N = 3); GCC 2454.22 (SE +/- 3.41, N = 3)
SciMark 2.0 - Computational Test: Monte Carlo (Mflops, more is better): Clang 450.23 (SE +/- 0.00, N = 3); GCC 463.86 (SE +/- 0.00, N = 3)
SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, more is better): Clang 500.91 (SE +/- 0.71, N = 3); GCC 563.13 (SE +/- 0.57, N = 3)
SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, more is better): Clang 4508.82 (SE +/- 16.69, N = 3); GCC 4132.95 (SE +/- 23.87, N = 3)
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, more is better): Clang 6907.38 (SE +/- 8.30, N = 3); GCC 5124.15 (SE +/- 6.68, N = 3)
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, more is better): Clang 1986.56 (SE +/- 0.12, N = 3); GCC 1987.03 (SE +/- 0.43, N = 3)
Reported build options: (CC) gcc options: -O3 -lm
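The Composite figure appears to be the arithmetic mean of the five kernel scores, and the numbers reported above are consistent with that reading. The short check below recomputes it from the per-kernel results. The helper name `composite` is hypothetical and used only for illustration.

```python
# Kernel scores in Mflops, copied from the SciMark 2.0 results in this file.
# Order: Monte Carlo, FFT, Sparse Matrix Multiply, Dense LU, Jacobi SOR.
KERNELS = {
    "Clang": [450.23, 500.91, 4508.82, 6907.38, 1986.56],
    "GCC": [463.86, 563.13, 4132.95, 5124.15, 1987.03],
}

# Composite scores as reported in this file.
REPORTED_COMPOSITE = {"Clang": 2870.78, "GCC": 2454.22}


def composite(scores):
    """Arithmetic mean of the kernel scores (assumed composite definition)."""
    return sum(scores) / len(scores)


for compiler, scores in KERNELS.items():
    recomputed = composite(scores)
    print(f"{compiler}: recomputed {recomputed:.2f}, reported {REPORTED_COMPOSITE[compiler]:.2f}")
```

Because the composite is a plain mean, the kernels with the largest absolute Mflops dominate it. Here most of the Clang lead in the composite comes from Dense LU Matrix Factorization, while GCC is ahead in Monte Carlo and Fast Fourier Transform.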
simdjson
This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others.
simdjson 2.0 - Throughput Test: Kostya (GB/s, more is better): Clang 3.03 (SE +/- 0.00, N = 3); GCC 3.05 (SE +/- 0.00, N = 3)
simdjson 2.0 - Throughput Test: TopTweet (GB/s, more is better): Clang 4.43 (SE +/- 0.00, N = 3); GCC 4.25 (SE +/- 0.00, N = 3)
simdjson 2.0 - Throughput Test: LargeRandom (GB/s, more is better): Clang 1.00 (SE +/- 0.00, N = 3); GCC 1.04 (SE +/- 0.00, N = 3)
simdjson 2.0 - Throughput Test: PartialTweets (GB/s, more is better): Clang 4.35 (SE +/- 0.00, N = 3); GCC 4.18 (SE +/- 0.00, N = 3)
simdjson 2.0 - Throughput Test: DistinctUserID (GB/s, more is better): Clang 4.45 (SE +/- 0.00, N = 3); GCC 4.19 (SE +/- 0.00, N = 3)
Reported build options: (CXX) g++ options: -O3
Sockperf
This is a network socket API performance benchmark developed by Mellanox. This test profile runs both the client and server on the local host for evaluating individual system performance.
Sockperf 3.7 - Test: Throughput (Messages Per Second, more is better): Clang 793185 (SE +/- 6642.87, N = 5); GCC 847339 (SE +/- 3804.35, N = 5)
Reported build options: (CXX) g++ options: --param -O3 -rdynamic
TNN
TNN is an open-source deep learning inference framework developed by Tencent.
TNN 0.3 - Target: CPU - Model: DenseNet (ms, fewer is better): Clang 7411.32 (SE +/- 63.12, N = 8, MIN: 5728.98 / MAX: 13889.4, -fopenmp=libomp); GCC 5229.21 (SE +/- 2.99, N = 3, MIN: 5076.63 / MAX: 5318.22, -fopenmp)
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, fewer is better): Clang 426.56 (SE +/- 3.88, N = 7, MIN: 350.24 / MAX: 452.86, -fopenmp=libomp); GCC 306.61 (SE +/- 0.23, N = 3, MIN: 296.56 / MAX: 311.25, -fopenmp)
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, fewer is better): Clang 56.61 (SE +/- 0.03, N = 3, MIN: 56.51 / MAX: 56.7, -fopenmp=libomp); GCC 53.28 (SE +/- 0.00, N = 3, MIN: 53.24 / MAX: 53.39, -fopenmp)
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, fewer is better): Clang 330.09 (SE +/- 0.02, N = 3, MIN: 329.96 / MAX: 330.29, -fopenmp=libomp); GCC 321.83 (SE +/- 0.06, N = 3, MIN: 321.48 / MAX: 322.34, -fopenmp)
Reported build options: (CXX) g++ options: -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
Xmrig
Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance.
Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M (H/s, more is better): Clang 2520.6 (SE +/- 20.43, N = 3); GCC 2329.9 (SE +/- 44.51, N = 7)
Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M (H/s, more is better): Clang 2676.2 (SE +/- 38.43, N = 3); GCC 2541.9 (SE +/- 20.64, N = 3)
Additional flags reported: -funroll-loops -static-libgcc -static-libstdc++
Reported build options: (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings.
Zstd Compression 1.5.0 - Compression Level: 3 - Compression Speed (MB/s, more is better): Clang 3525.2 (SE +/- 29.67, N = 3); GCC 3526.6 (SE +/- 7.65, N = 3)
Zstd Compression 1.5.0 - Compression Level: 3 - Decompression Speed (MB/s, more is better): Clang 4234.8 (SE +/- 3.94, N = 3); GCC 4821.4 (SE +/- 0.42, N = 3)
Zstd Compression 1.5.0 - Compression Level: 8 - Compression Speed (MB/s, more is better): Clang 880.2 (SE +/- 10.66, N = 3); GCC 866.1 (SE +/- 6.29, N = 3)
Zstd Compression 1.5.0 - Compression Level: 8 - Decompression Speed (MB/s, more is better): Clang 4385.7 (SE +/- 0.60, N = 3); GCC 4986.0 (SE +/- 14.08, N = 3)
Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed (MB/s, more is better): Clang 25.9 (SE +/- 0.20, N = 3); GCC 26.8 (SE +/- 0.17, N = 3)
Zstd Compression 1.5.0 - Compression Level: 19 - Decompression Speed (MB/s, more is better): Clang 3932.4 (SE +/- 3.23, N = 3); GCC 4513.1 (SE +/- 6.68, N = 3)
Zstd Compression 1.5.0 - Compression Level: 3, Long Mode - Compression Speed (MB/s, more is better): Clang 272.4 (SE +/- 3.71, N = 3); GCC 265.4 (SE +/- 2.09, N = 3)
Zstd Compression 1.5.0 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, more is better): Clang 4647.8 (SE +/- 0.98, N = 3); GCC 5241.1 (SE +/- 5.93, N = 3)
Zstd Compression 1.5.0 - Compression Level: 8, Long Mode - Compression Speed (MB/s, more is better): Clang 691.9 (SE +/- 10.06, N = 15); GCC 716.9 (SE +/- 4.40, N = 15)
Zstd Compression 1.5.0 - Compression Level: 8, Long Mode - Decompression Speed (MB/s, more is better): Clang 4854.7 (SE +/- 1.01, N = 15); GCC 5478.3 (SE +/- 1.43, N = 15)
Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed (MB/s, more is better): Clang 21.0 (SE +/- 0.26, N = 3); GCC 21.9 (SE +/- 0.29, N = 3)
Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, more is better): Clang 4146.6 (SE +/- 4.49, N = 3); GCC 4543.4 (SE +/- 2.75, N = 3)
Reported build options: (CC) gcc options: -O3 -pthread -lz -llzma -llz4
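The Zstd compression speeds are close between the two compilers, while GCC leads in every decompression result by roughly 10 to 15 percent. One way to summarize that is a per-test GCC/Clang ratio and a geometric mean across the six decompression tests, sketched below. The geometric mean is chosen here as a common convention for averaging ratios and is an assumption of this sketch, not something this result file reports. The helper name `gcc_over_clang` is hypothetical.

```python
from statistics import geometric_mean

# Decompression speed in MB/s as (Clang, GCC), copied from the Zstd 1.5.0
# results in this file. Keys are the compression level / mode labels.
DECOMPRESSION_MB_S = {
    "3": (4234.8, 4821.4),
    "8": (4385.7, 4986.0),
    "19": (3932.4, 4513.1),
    "3, Long Mode": (4647.8, 5241.1),
    "8, Long Mode": (4854.7, 5478.3),
    "19, Long Mode": (4146.6, 4543.4),
}


def gcc_over_clang(results):
    """Return the GCC/Clang throughput ratio per test (above 1.0 means GCC is faster)."""
    return {name: gcc / clang for name, (clang, gcc) in results.items()}


ratios = gcc_over_clang(DECOMPRESSION_MB_S)
for name, ratio in ratios.items():
    print(f"level {name}: GCC/Clang = {ratio:.3f}")
print(f"geometric mean: {geometric_mean(ratios.values()):.3f}")
```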
Clang: Testing initiated at 14 August 2022 13:31 by user phoronix.
GCC: Testing initiated at 15 August 2022 06:30 by user phoronix.