Apple M2 compiler benchmarks for a future article by Michael Larabel.
Clang
Processor: Apple M2 @ 2.42GHz (4 Cores / 8 Threads), Motherboard: Apple MacBook Air (13-inch, M2, 2022), Memory: 8GB, Disk: 251GB APPLE SSD AP0256Z + 2 x 0GB APPLE SSD AP0256Z, Graphics: llvmpipe, Network: Broadcom Device 4433 + Broadcom Device 5f71
OS: Arch rolling, Kernel: 5.19.0-rc7-asahi-2-1-ARCH (aarch64), Desktop: KDE Plasma 5.25.4, Display Server: X Server 1.21.1.4, OpenGL: 4.5 Mesa 22.1.6 (LLVM 14.0.6 128 bits), Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 2560x1600
Environment Notes: CFLAGS=-O3
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
GCC
OS: Arch rolling, Kernel: 5.19.0-rc7-asahi-2-1-ARCH (aarch64), Desktop: KDE Plasma 5.25.4, Display Server: X Server 1.21.1.4, OpenGL: 4.5 Mesa 22.1.6 (LLVM 14.0.6 128 bits), Compiler: GCC 12.1.0 + Clang 14.0.6, File-System: ext4, Screen Resolution: 2560x1600
Environment Notes: CFLAGS=-O3
Compiler Notes: --build=aarch64-unknown-linux-gnu --disable-libssp --disable-libstdcxx-pch --disable-multilib --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-fix-cortex-a53-835769 --enable-fix-cortex-a53-843419 --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=c,c++,fortran,go,lto,objc,obj-c++ --enable-lto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-unknown-linux-gnu --mandir=/usr/share/man --with-arch=armv8-a --with-linker-hash-style=gnu
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
TNN TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.
TNN 0.3, Target: CPU - Model: DenseNet (ms, fewer is better). GCC: 5229.21 (SE +/- 2.99, N = 3; MIN: 5076.63 / MAX: 5318.22; extra flag: -fopenmp). Clang: 7411.32 (SE +/- 63.12, N = 8; MIN: 5728.98 / MAX: 13889.4; extra flag: -fopenmp=libomp). (CXX) g++ options: -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
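As a rough aid for reading results like the one above, the following Python sketch computes the relative gap between the two compilers and expresses it in units of the combined standard error. The numbers are the TNN DenseNet figures reported above; the helper names are hypothetical, and combining the two standard errors in quadrature assumes the GCC and Clang runs are independent, which the source does not state.

```python
import math

def relative_gap_percent(baseline, other):
    """Percent by which `other` exceeds `baseline`."""
    return (other - baseline) / baseline * 100.0

def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error.

    Assumes the two sets of runs are independent (an assumption, not a
    statement from the result page).
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_b - mean_a) / combined_se

# TNN 0.3, DenseNet, milliseconds (fewer is better), as reported above.
gcc_ms, gcc_se = 5229.21, 2.99
clang_ms, clang_se = 7411.32, 63.12

slowdown_pct = relative_gap_percent(gcc_ms, clang_ms)
separation = gap_in_standard_errors(gcc_ms, gcc_se, clang_ms, clang_se)

print(f"Clang build is {slowdown_pct:.1f}% slower than the GCC build")
print(f"The gap is {separation:.1f} combined standard errors wide")
```

A gap many standard errors wide suggests the difference is unlikely to be run-to-run noise, although the wide MIN/MAX range on the Clang side shows that individual runs varied considerably.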
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.12.1, Variant: Monero - Hash Count: 1M (H/s, more is better). GCC: 2329.9 (SE +/- 44.51, N = 7). Clang: 2520.6 (SE +/- 20.43, N = 3). Additional reported flags: -static-libgcc -static-libstdc++ -funroll-loops. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
NCNN NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.
NCNN 20220729, Target: CPU - Model: FastestDet (ms, fewer is better). GCC: 1.87 (SE +/- 0.11, N = 3; MIN: 1.69 / MAX: 12.73). Clang: 4.05 (SE +/- 0.40, N = 11; MIN: 2.43 / MAX: 8.08).
NCNN 20220729, Target: CPU - Model: vision_transformer (ms, fewer is better). GCC: 501.17 (SE +/- 0.71, N = 3; MIN: 475.83 / MAX: 544.85). Clang: 544.38 (SE +/- 10.55, N = 12; MIN: 375.2 / MAX: 1425.41).
NCNN 20220729, Target: CPU - Model: regnety_400m (ms, fewer is better). GCC: 5.19 (SE +/- 0.03, N = 3; MIN: 5.13 / MAX: 9.15). Clang: 10.17 (SE +/- 0.41, N = 12; MIN: 7.32 / MAX: 18.84).
NCNN 20220729, Target: CPU - Model: squeezenet_ssd (ms, fewer is better). GCC: 9.28 (SE +/- 0.10, N = 3; MIN: 7.07 / MAX: 20.52). Clang: 12.09 (SE +/- 1.64, N = 12; MIN: 7.83 / MAX: 31.07).
NCNN 20220729, Target: CPU - Model: yolov4-tiny (ms, fewer is better). GCC: 14.51 (SE +/- 0.07, N = 3; MIN: 12.49 / MAX: 23.48). Clang: 20.24 (SE +/- 1.95, N = 12; MIN: 13.87 / MAX: 70.08).
NCNN 20220729, Target: CPU - Model: resnet50 (ms, fewer is better). GCC: 15.24 (SE +/- 0.03, N = 3; MIN: 13.38 / MAX: 25.59). Clang: 28.71 (SE +/- 3.28, N = 12; MIN: 13.32 / MAX: 216.5).
NCNN 20220729, Target: CPU - Model: alexnet (ms, fewer is better). GCC: 12.39 (SE +/- 0.12, N = 3; MIN: 8.84 / MAX: 22.79). Clang: 11.80 (SE +/- 0.48, N = 12; MIN: 9.79 / MAX: 24.49).
NCNN 20220729, Target: CPU - Model: resnet18 (ms, fewer is better). GCC: 7.04 (SE +/- 0.18, N = 3; MIN: 5.13 / MAX: 19.31). Clang: 5.90 (SE +/- 0.05, N = 12; MIN: 5.32 / MAX: 7.13).
NCNN 20220729, Target: CPU - Model: vgg16 (ms, fewer is better). GCC: 32.80 (SE +/- 0.02, N = 3; MIN: 27.43 / MAX: 42.65). Clang: 34.44 (SE +/- 1.89, N = 12; MIN: 27.08 / MAX: 74.43).
NCNN 20220729, Target: CPU - Model: googlenet (ms, fewer is better). GCC: 12.72 (SE +/- 0.07, N = 3; MIN: 8.11 / MAX: 27.99). Clang: 10.83 (SE +/- 0.19, N = 12; MIN: 8.82 / MAX: 22.43).
NCNN 20220729, Target: CPU - Model: blazeface (ms, fewer is better). GCC: 1.90 (SE +/- 0.05, N = 3; MIN: 0.98 / MAX: 7.72). Clang: 2.09 (SE +/- 0.06, N = 12; MIN: 1.37 / MAX: 3.21).
NCNN 20220729, Target: CPU - Model: efficientnet-b0 (ms, fewer is better). GCC: 3.26 (SE +/- 0.02, N = 3; MIN: 3.21 / MAX: 12.85). Clang: 5.56 (SE +/- 0.11, N = 12; MIN: 4.41 / MAX: 15.78).
NCNN 20220729, Target: CPU - Model: mnasnet (ms, fewer is better). GCC: 2.28 (SE +/- 0.09, N = 3; MIN: 2.11 / MAX: 5.51). Clang: 3.53 (SE +/- 0.07, N = 12; MIN: 2.72 / MAX: 7.36).
NCNN 20220729, Target: CPU - Model: shufflenet-v2 (ms, fewer is better). GCC: 1.90 (SE +/- 0.05, N = 3; MIN: 1.82 / MAX: 5.07). Clang: 3.00 (SE +/- 0.07, N = 12; MIN: 2.11 / MAX: 12.45).
NCNN 20220729, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better). GCC: 2.20 (SE +/- 0.03, N = 3; MIN: 2.08 / MAX: 5.17). Clang: 3.25 (SE +/- 0.11, N = 12; MIN: 2.49 / MAX: 9.1).
NCNN 20220729, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, fewer is better). GCC: 2.21 (SE +/- 0.08, N = 3; MIN: 2.09 / MAX: 5.32). Clang: 3.62 (SE +/- 0.27, N = 12; MIN: 2.9 / MAX: 8.87).
NCNN 20220729, Target: CPU - Model: mobilenet (ms, fewer is better). GCC: 11.99 (SE +/- 0.15, N = 3; MIN: 8.33 / MAX: 23.04). Clang: 13.07 (SE +/- 0.34, N = 12; MIN: 9.68 / MAX: 25.4).
All NCNN 20220729 results: GCC extra flag: -lgomp. Clang extra flag: -lomp. (CXX) g++ options: -O3 -rdynamic -lpthread
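With seventeen NCNN 20220729 models pointing in different directions, a single summary figure can help. The Python sketch below takes the GCC and Clang mean latencies listed above and computes the geometric mean of the Clang/GCC ratios, plus the list of models where Clang was faster. The dictionary and variable names are hypothetical, and the geometric mean is one common way of summarizing ratios, not something the result page itself reports.

```python
from statistics import geometric_mean

# (GCC ms, Clang ms) per model, copied from the NCNN 20220729 results above.
ncnn_ms = {
    "FastestDet": (1.87, 4.05),
    "vision_transformer": (501.17, 544.38),
    "regnety_400m": (5.19, 10.17),
    "squeezenet_ssd": (9.28, 12.09),
    "yolov4-tiny": (14.51, 20.24),
    "resnet50": (15.24, 28.71),
    "alexnet": (12.39, 11.80),
    "resnet18": (7.04, 5.90),
    "vgg16": (32.80, 34.44),
    "googlenet": (12.72, 10.83),
    "blazeface": (1.90, 2.09),
    "efficientnet-b0": (3.26, 5.56),
    "mnasnet": (2.28, 3.53),
    "shufflenet-v2": (1.90, 3.00),
    "mobilenet-v3": (2.20, 3.25),
    "mobilenet-v2": (2.21, 3.62),
    "mobilenet": (11.99, 13.07),
}

# Ratio > 1 means the Clang build took longer than the GCC build.
ratios = {model: clang / gcc for model, (gcc, clang) in ncnn_ms.items()}

overall_ratio = geometric_mean(list(ratios.values()))
clang_faster = [model for model, ratio in ratios.items() if ratio < 1.0]

print(f"Geometric mean of Clang/GCC latency ratios: {overall_ratio:.2f}")
print("Models where Clang was faster:", ", ".join(clang_faster))
```

The geometric mean is used here because it treats a 2x slowdown and a 2x speedup symmetrically, which an arithmetic mean of ratios does not.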
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, fewer is better). GCC: 79.03 (SE +/- 0.60, N = 15). Clang: 95.54 (SE +/- 1.77, N = 15). (CC) gcc options: -lm -lpthread -O3
Xmrig 6.12.1, Variant: Wownero - Hash Count: 1M (H/s, more is better). GCC: 2541.9 (SE +/- 20.64, N = 3). Clang: 2676.2 (SE +/- 38.43, N = 3). Additional reported flags: -static-libgcc -static-libstdc++ -funroll-loops. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33, Operation: Sharpen (Iterations Per Minute, more is better). GCC: 102 (SE +/- 1.24, N = 15). Clang: 97 (SE +/- 1.14, N = 15).
GraphicsMagick 1.3.33, Operation: Noise-Gaussian (Iterations Per Minute, more is better). GCC: 164 (SE +/- 2.09, N = 15). Clang: 121 (SE +/- 1.33, N = 15).
GraphicsMagick 1.3.33, Operation: Swirl (Iterations Per Minute, more is better). GCC: 322 (SE +/- 4.72, N = 15). Clang: 311 (SE +/- 6.11, N = 15).
All three GraphicsMagick results above: (CC) gcc options: -fopenmp -O3 -lwebp -lwebpmux -llcms2 -ltiff -lfreetype -ljasper -ljpeg -lwmflite -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
Ngspice 34, Circuit: C2670 (Seconds, fewer is better). GCC: 100.08 (SE +/- 3.54, N = 15). Clang: 76.93 (SE +/- 0.28, N = 3). (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE
Gcrypt Library Libgcrypt is a general-purpose cryptographic library developed as part of the GnuPG project. This test runs libgcrypt's integrated benchmark and measures the time to run the benchmark command with a cipher/mac/hash repetition count of 50, as a simple, high-level look at the overall crypto performance of the system under test. Learn more via the OpenBenchmarking.org test page.
Gcrypt Library 1.9 (Seconds, fewer is better). GCC: 258.39 (SE +/- 0.56, N = 3). Clang: 255.91 (SE +/- 0.53, N = 3). (CC) gcc options: -O3 -fvisibility=hidden -lgpg-error
JPEG XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
JPEG XL libjxl 0.6.1, Input: PNG - Encode Speed: 5 (MP/s, more is better). GCC: 20.78 (SE +/- 0.20, N = 15). Clang: 35.58 (SE +/- 0.36, N = 15). Additional reported flags: -Xclang -mrelax-all. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie
Ngspice 34, Circuit: C7552 (Seconds, fewer is better). GCC: 66.06 (SE +/- 1.02, N = 15). Clang: 56.10 (SE +/- 0.24, N = 3). (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE
GraphicsMagick 1.3.33, Operation: Enhanced (Iterations Per Minute, more is better). GCC: 171 (SE +/- 1.86, N = 3). Clang: 155 (SE +/- 1.66, N = 15).
GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, more is better). GCC: 539 (SE +/- 4.56, N = 15). Clang: 613 (SE +/- 6.24, N = 3).
GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, more is better). GCC: 1361 (SE +/- 18.02, N = 3). Clang: 950 (SE +/- 12.49, N = 15).
All three GraphicsMagick results above: (CC) gcc options: -fopenmp -O3 -lwebp -lwebpmux -llcms2 -ltiff -lfreetype -ljasper -ljpeg -lwmflite -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.0, Algorithm: SHA256 (byte/s, more is better). GCC: 8534811807 (SE +/- 13180414.49, N = 3). Clang: 9058229720 (SE +/- 10759857.48, N = 3). Additional reported flags: -Qunused-arguments. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
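To make the byte/s unit concrete, here is a minimal Python sketch that hashes a fixed block repeatedly for a short interval and divides the bytes processed by the elapsed time. It only illustrates what a hashing throughput figure means; it is not how "openssl speed" is implemented, and the function name, the 16 KiB block size and the 0.2 second duration are assumptions chosen for the example.

```python
import hashlib
import time

def sha256_throughput(block_size=16384, duration=0.2):
    """Return approximate SHA-256 throughput in bytes per second.

    block_size and duration are illustrative defaults, not values taken
    from the benchmark above.
    """
    block = bytes(block_size)      # zero-filled input buffer
    hasher = hashlib.sha256()
    processed = 0
    start = time.perf_counter()
    # Keep feeding the same block until the time budget is used up.
    while time.perf_counter() - start < duration:
        hasher.update(block)
        processed += block_size
    elapsed = time.perf_counter() - start
    return processed / elapsed

rate = sha256_throughput()
print(f"SHA-256: about {rate / 1e6:.0f} MB/s on this machine")
```

Numbers from this sketch are not comparable with the results above, since they depend on the Python interpreter, the underlying hash implementation and the block size.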
Zstd Compression This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.0, Compression Level: 8, Long Mode - Decompression Speed (MB/s, more is better). GCC: 5478.3 (SE +/- 1.43, N = 15). Clang: 4854.7 (SE +/- 1.01, N = 15).
Zstd Compression 1.5.0, Compression Level: 8, Long Mode - Compression Speed (MB/s, more is better). GCC: 716.9 (SE +/- 4.40, N = 15). Clang: 691.9 (SE +/- 10.06, N = 15).
Both Zstd results above: (CC) gcc options: -O3 -pthread -lz -llzma -llz4
NCNN 20210720, Target: CPU - Model: mnasnet (ms, fewer is better). GCC: 2.23 (MIN: 2.22 / MAX: 2.33). Clang: 3.63 (MIN: 2.94 / MAX: 4.76). Only one error figure is reported for this result: SE +/- 0.08, N = 13.
NCNN 20210720, Target: CPU - Model: regnety_400m (ms, fewer is better). GCC: 5.28 (SE +/- 0.04, N = 4; MIN: 5.16 / MAX: 5.43). Clang: 10.38 (SE +/- 0.26, N = 13; MIN: 8.02 / MAX: 15.57).
NCNN 20210720, Target: CPU - Model: squeezenet_ssd (ms, fewer is better). GCC: 14.40 (SE +/- 0.06, N = 4; MIN: 9.79 / MAX: 29.22). Clang: 12.64 (SE +/- 0.17, N = 13; MIN: 9.67 / MAX: 27.34).
NCNN 20210720, Target: CPU - Model: yolov4-tiny (ms, fewer is better). GCC: 14.67 (SE +/- 0.31, N = 4; MIN: 12.77 / MAX: 26.02). Clang: 20.18 (SE +/- 0.57, N = 13; MIN: 15.44 / MAX: 33.23).
NCNN 20210720, Target: CPU - Model: resnet50 (ms, fewer is better). GCC: 15.36 (SE +/- 0.12, N = 4; MIN: 13.43 / MAX: 24.89). Clang: 15.69 (SE +/- 0.48, N = 13; MIN: 13.48 / MAX: 29.95).
NCNN 20210720, Target: CPU - Model: alexnet (ms, fewer is better). GCC: 12.08 (SE +/- 0.11, N = 4; MIN: 9.11 / MAX: 22.77). Clang: 11.15 (SE +/- 0.31, N = 13; MIN: 10.07 / MAX: 22.61).
NCNN 20210720, Target: CPU - Model: resnet18 (ms, fewer is better). GCC: 7.84 (SE +/- 0.12, N = 4; MIN: 5.48 / MAX: 21.03). Clang: 6.16 (SE +/- 0.07, N = 13; MIN: 5.6 / MAX: 7.49).
NCNN 20210720, Target: CPU - Model: vgg16 (ms, fewer is better). GCC: 33.70 (SE +/- 0.10, N = 4; MIN: 28.28 / MAX: 51.28). Clang: 29.59 (SE +/- 0.34, N = 13; MIN: 27.88 / MAX: 48.18).
NCNN 20210720, Target: CPU - Model: googlenet (ms, fewer is better). GCC: 12.59 (SE +/- 0.06, N = 4; MIN: 8.3 / MAX: 21.52). Clang: 11.03 (SE +/- 0.26, N = 13; MIN: 9.1 / MAX: 21.81).
NCNN 20210720, Target: CPU - Model: blazeface (ms, fewer is better). GCC: 1.86 (SE +/- 0.04, N = 4; MIN: 1.02 / MAX: 8.5). Clang: 2.28 (SE +/- 0.12, N = 13; MIN: 1.38 / MAX: 5.9).
NCNN 20210720, Target: CPU - Model: efficientnet-b0 (ms, fewer is better). GCC: 3.82 (SE +/- 0.20, N = 4; MIN: 3.55 / MAX: 25.76). Clang: 6.55 (SE +/- 0.15, N = 13; MIN: 5.13 / MAX: 7.99).
NCNN 20210720, Target: CPU - Model: shufflenet-v2 (ms, fewer is better). GCC: 1.92 (SE +/- 0.00, N = 4; MIN: 1.9 / MAX: 2.28). Clang: 3.13 (SE +/- 0.07, N = 13; MIN: 2.48 / MAX: 4.66).
NCNN 20210720, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better). GCC: 2.06 (SE +/- 0.01, N = 4; MIN: 2.03 / MAX: 2.17). Clang: 3.31 (SE +/- 0.08, N = 13; MIN: 2.56 / MAX: 7.12).
NCNN 20210720, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, fewer is better). GCC: 2.20 (SE +/- 0.01, N = 4; MIN: 2.17 / MAX: 2.4). Clang: 3.47 (SE +/- 0.06, N = 13; MIN: 3.08 / MAX: 5.31).
NCNN 20210720, Target: CPU - Model: mobilenet (ms, fewer is better). GCC: 11.94 (SE +/- 0.15, N = 4; MIN: 8.2 / MAX: 27.45). Clang: 15.03 (SE +/- 1.12, N = 13; MIN: 10.47 / MAX: 34.45).
All NCNN 20210720 results: GCC extra flag: -lgomp. Clang extra flag: -lomp. (CXX) g++ options: -O3 -rdynamic -lpthread
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
POV-Ray 3.7.0.7, Trace Time (Seconds, fewer is better). GCC: 91.46 (SE +/- 1.04, N = 4). Clang: 88.93 (SE +/- 0.91, N = 4). Additional reported flags: -R/usr/lib. (CXX) g++ options: -pipe -O3 -ffast-math -lSDL -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
JPEG XL libjxl 0.6.1, Input: PNG - Encode Speed: 7 (MP/s, more is better). GCC: 9.78 (SE +/- 0.02, N = 3). Clang: 13.12 (SE +/- 0.03, N = 3). Additional reported flags: -Xclang -mrelax-all. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie
Basis Universal Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings. Learn more via the OpenBenchmarking.org test page.
Basis Universal 1.13, Settings: UASTC Level 3 (Seconds, fewer is better). GCC: 70.00 (SE +/- 0.74, N = 3). Clang: 70.16 (SE +/- 0.68, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Decompression Speed (MB/s, more is better). GCC: 4543.4 (SE +/- 2.75, N = 3). Clang: 4146.6 (SE +/- 4.49, N = 3).
Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Compression Speed (MB/s, more is better). GCC: 21.9 (SE +/- 0.29, N = 3). Clang: 21.0 (SE +/- 0.26, N = 3).
Both Zstd results above: (CC) gcc options: -O3 -pthread -lz -llzma -llz4
simdjson This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
simdjson 2.0, Throughput Test: DistinctUserID (GB/s, more is better). GCC: 4.19 (SE +/- 0.00, N = 3). Clang: 4.45 (SE +/- 0.00, N = 3).
simdjson 2.0, Throughput Test: PartialTweets (GB/s, more is better). GCC: 4.18 (SE +/- 0.00, N = 3). Clang: 4.35 (SE +/- 0.00, N = 3).
simdjson 2.0, Throughput Test: TopTweet (GB/s, more is better). GCC: 4.25 (SE +/- 0.00, N = 3). Clang: 4.43 (SE +/- 0.00, N = 3).
All three simdjson results above: (CXX) g++ options: -O3
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31, Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, more is better). GCC: 167353333 (SE +/- 521674.65, N = 3). Clang: 190544667 (SE +/- 1293649.64, N = 15). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, more is better). GCC: 1586 (SE +/- 2.60, N = 3). Clang: 1571 (SE +/- 11.02, N = 3). (CC) gcc options: -fopenmp -O3 -lwebp -lwebpmux -llcms2 -ltiff -lfreetype -ljasper -ljpeg -lwmflite -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread
OpenSSL 3.0, Algorithm: RSA4096 (verify/s, more is better). GCC: 106464.4 (SE +/- 1045.44, N = 3). Clang: 107954.2 (SE +/- 712.36, N = 3).
OpenSSL 3.0, Algorithm: RSA4096 (sign/s, more is better). GCC: 1520.3 (SE +/- 15.18, N = 3). Clang: 1529.5 (SE +/- 10.91, N = 3).
Both OpenSSL results above: Additional reported flags: -Qunused-arguments. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
Zstd Compression 1.5.0, Compression Level: 19 - Decompression Speed (MB/s, more is better). GCC: 4513.1 (SE +/- 6.68, N = 3). Clang: 3932.4 (SE +/- 3.23, N = 3).
Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, more is better). GCC: 26.8 (SE +/- 0.17, N = 3). Clang: 25.9 (SE +/- 0.20, N = 3).
Both Zstd results above: (CC) gcc options: -O3 -pthread -lz -llzma -llz4
simdjson 2.0, Throughput Test: Kostya (GB/s, more is better). GCC: 3.05 (SE +/- 0.00, N = 3). Clang: 3.03 (SE +/- 0.00, N = 3).
simdjson 2.0, Throughput Test: LargeRandom (GB/s, more is better). GCC: 1.04 (SE +/- 0.00, N = 3). Clang: 1.00 (SE +/- 0.00, N = 3).
Both simdjson results above: (CXX) g++ options: -O3
TNN 0.3, Target: CPU - Model: MobileNet v2 (ms, fewer is better). GCC: 306.61 (SE +/- 0.23, N = 3; MIN: 296.56 / MAX: 311.25; extra flag: -fopenmp). Clang: 426.56 (SE +/- 3.88, N = 7; MIN: 350.24 / MAX: 452.86; extra flag: -fopenmp=libomp). (CXX) g++ options: -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
Zstd Compression 1.5.0, Compression Level: 8 - Decompression Speed (MB/s, more is better). GCC: 4986.0 (SE +/- 14.08, N = 3). Clang: 4385.7 (SE +/- 0.60, N = 3).
Zstd Compression 1.5.0, Compression Level: 8 - Compression Speed (MB/s, more is better). GCC: 866.1 (SE +/- 6.29, N = 3). Clang: 880.2 (SE +/- 10.66, N = 3).
Zstd Compression 1.5.0, Compression Level: 3, Long Mode - Decompression Speed (MB/s, more is better). GCC: 5241.1 (SE +/- 5.93, N = 3). Clang: 4647.8 (SE +/- 0.98, N = 3).
Zstd Compression 1.5.0, Compression Level: 3, Long Mode - Compression Speed (MB/s, more is better). GCC: 265.4 (SE +/- 2.09, N = 3). Clang: 272.4 (SE +/- 3.71, N = 3).
All four Zstd results above: (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Basis Universal 1.13, Settings: UASTC Level 2 (Seconds, fewer is better). GCC: 34.12 (SE +/- 0.14, N = 3). Clang: 35.88 (SE +/- 0.24, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Zstd Compression 1.5.0, Compression Level: 3 - Decompression Speed (MB/s, more is better). GCC: 4821.4 (SE +/- 0.42, N = 3). Clang: 4234.8 (SE +/- 3.94, N = 3).
Zstd Compression 1.5.0, Compression Level: 3 - Compression Speed (MB/s, more is better). GCC: 3526.6 (SE +/- 7.65, N = 3). Clang: 3525.2 (SE +/- 29.67, N = 3).
Both Zstd results above: (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3, Test: Blowfish - Decrypt (MiB/s, more is better). GCC: 416.06 (SE +/- 0.06, N = 3). Clang: 436.82 (SE +/- 0.01, N = 3).
Botan 2.17.3, Test: Blowfish (MiB/s, more is better). GCC: 415.07 (SE +/- 0.06, N = 3). Clang: 436.06 (SE +/- 0.13, N = 3).
Both Botan results above: (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: Twofish - Decrypt (MiB/s, More Is Better): GCC 348.14 (SE +/- 0.03, N = 3); Clang 347.60 (SE +/- 0.00, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: Twofish (MiB/s, More Is Better): GCC 341.75 (SE +/- 0.11, N = 3); Clang 351.49 (SE +/- 0.04, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better): GCC 547.82 (SE +/- 0.29, N = 3); Clang 570.92 (SE +/- 0.16, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: ChaCha20Poly1305 (MiB/s, More Is Better): GCC 558.13 (SE +/- 0.08, N = 3); Clang 578.63 (SE +/- 0.13, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: CAST-256 - Decrypt (MiB/s, More Is Better): GCC 137.21 (SE +/- 0.02, N = 3); Clang 136.99 (SE +/- 0.01, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: CAST-256 (MiB/s, More Is Better): GCC 136.75 (SE +/- 0.03, N = 3); Clang 136.76 (SE +/- 0.03, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: KASUMI - Decrypt (MiB/s, More Is Better): GCC 84.11 (SE +/- 0.00, N = 3); Clang 91.83 (SE +/- 0.00, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: KASUMI (MiB/s, More Is Better): GCC 84.30 (SE +/- 0.01, N = 3); Clang 93.79 (SE +/- 0.01, N = 3). (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt
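The Botan figures are easier to compare as relative differences than as raw MiB/s. The following is a minimal sketch, not part of the original result file: the dictionary simply copies the GCC and Clang values reported above, and the helper name `clang_vs_gcc_percent` is a hypothetical one chosen for illustration.

```python
# Botan 2.17.3 throughput in MiB/s as (GCC, Clang) pairs,
# copied from the results reported above.
BOTAN_MIB_S = {
    "Blowfish - Decrypt": (416.06, 436.82),
    "Blowfish": (415.07, 436.06),
    "Twofish - Decrypt": (348.14, 347.60),
    "Twofish": (341.75, 351.49),
    "ChaCha20Poly1305 - Decrypt": (547.82, 570.92),
    "ChaCha20Poly1305": (558.13, 578.63),
    "CAST-256 - Decrypt": (137.21, 136.99),
    "CAST-256": (136.75, 136.76),
    "KASUMI - Decrypt": (84.11, 91.83),
    "KASUMI": (84.30, 93.79),
}


def clang_vs_gcc_percent(gcc: float, clang: float) -> float:
    """Relative difference of Clang over GCC in percent (hypothetical helper)."""
    return (clang - gcc) / gcc * 100.0


# Positive values mean Clang is faster, because more MiB/s is better here.
for test, (gcc, clang) in BOTAN_MIB_S.items():
    print(f"{test:28s} {clang_vs_gcc_percent(gcc, clang):+6.2f}%")
```

Under this calculation the largest gap is KASUMI at roughly 11% in favor of Clang, while CAST-256 and Twofish - Decrypt land within about 0.2% of each other.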
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0, Computational Test: Composite (Mflops, More Is Better): GCC 2454.22 (SE +/- 3.41, N = 3); Clang 2870.78 (SE +/- 3.27, N = 3). (CC) gcc options: -O3 -lm
Basis Universal Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings. Learn more via the OpenBenchmarking.org test page.
Basis Universal 1.13, Settings: ETC1S (Seconds, Fewer Is Better): GCC 24.13 (SE +/- 0.11, N = 3); Clang 26.16 (SE +/- 0.20, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Opus Codec Encoding Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus. Learn more via the OpenBenchmarking.org test page.
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better): GCC 14.30 (SE +/- 0.02, N = 5); Clang 14.67 (SE +/- 0.01, N = 5). Additional flags listed: -fvisibility=hidden. (CXX) g++ options: -logg -lm
TNN TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.
TNN 0.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): GCC 321.83 (SE +/- 0.06, N = 3; -fopenmp; MIN: 321.48 / MAX: 322.34); Clang 330.09 (SE +/- 0.02, N = 3; -fopenmp=libomp; MIN: 329.96 / MAX: 330.29). (CXX) g++ options: -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31, Threads: 4 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): GCC 121106667 (SE +/- 8819.17, N = 3); Clang 153386667 (SE +/- 133832.40, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31, Threads: 2 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): GCC 60534667 (SE +/- 4910.31, N = 3); Clang 76895333 (SE +/- 2403.70, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31, Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): GCC 30318000 (SE +/- 3511.88, N = 3); Clang 38463333 (SE +/- 4977.73, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
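Because Liquid-DSP was run at 1, 2, and 4 threads, the results also show how well each build scales. The sketch below is an illustration rather than part of the original result file: it copies the reported samples/s figures and divides multi-threaded throughput by the ideal (single-thread throughput times thread count). The helper name `scaling_efficiency` is hypothetical.

```python
# Liquid-DSP 2021.01.31 throughput in samples/s by thread count
# (Buffer Length: 256, Filter Length: 57), copied from the results above.
LIQUID_DSP_SAMPLES_S = {
    "GCC": {1: 30318000, 2: 60534667, 4: 121106667},
    "Clang": {1: 38463333, 2: 76895333, 4: 153386667},
}


def scaling_efficiency(single: float, multi: float, threads: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return multi / (single * threads)


for compiler, by_threads in LIQUID_DSP_SAMPLES_S.items():
    single = by_threads[1]
    for threads in (2, 4):
        eff = scaling_efficiency(single, by_threads[threads], threads)
        print(f"{compiler:5s} {threads} threads: {eff:.4f} of ideal")
```

With the numbers above, every efficiency value comes out above 0.99, so the gap between GCC and Clang is essentially the same at each thread count.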
Sockperf This is a network socket API performance benchmark developed by Mellanox. This test profile runs both the client and server on the local host for evaluating individual system performance. Learn more via the OpenBenchmarking.org test page.
Sockperf 3.7, Test: Throughput (Messages Per Second, More Is Better): GCC 847339 (SE +/- 3804.35, N = 5); Clang 793185 (SE +/- 6642.87, N = 5). (CXX) g++ options: --param -O3 -rdynamic
JPEG XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
JPEG XL libjxl 0.6.1, Input: JPEG - Encode Speed: 5 (MP/s, More Is Better): GCC 101.26 (SE +/- 0.28, N = 3); Clang 114.37 (SE +/- 0.68, N = 3). Additional flags listed: -Xclang -mrelax-all. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie
JPEG XL libjxl 0.6.1, Input: JPEG - Encode Speed: 7 (MP/s, More Is Better): GCC 102.13 (SE +/- 0.55, N = 3); Clang 114.98 (SE +/- 0.49, N = 3). Additional flags listed: -Xclang -mrelax-all. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie
OpenJPEG OpenJPEG is an open-source JPEG 2000 codec written in the C programming language. The default input for this test profile is the NASA/JPL-Caltech/MSSS Curiosity panorama 717MB TIFF image file converting to JPEG2000 format. Learn more via the OpenBenchmarking.org test page.
OpenJPEG 2.4, Encode: NASA Curiosity Panorama M34 (ms, Fewer Is Better): GCC 48829 (SE +/- 145.89, N = 3); Clang 48724 (SE +/- 90.00, N = 3). (CXX) g++ options: -rdynamic
JPEG XL libjxl 0.6.1, Input: JPEG - Encode Speed: 8 (MP/s, More Is Better): GCC 38.16 (SE +/- 0.19, N = 3); Clang 38.65 (SE +/- 0.24, N = 3). Additional flags listed: -Xclang -mrelax-all. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie
Basis Universal 1.13, Settings: UASTC Level 0 (Seconds, Fewer Is Better): GCC 6.003 (SE +/- 0.007, N = 3); Clang 6.548 (SE +/- 0.012, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Google Draco Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
Google Draco 1.5.0, Model: Church Facade (ms, Fewer Is Better): GCC 5080 (SE +/- 4.58, N = 3); Clang 5044 (SE +/- 3.71, N = 3). (CXX) g++ options: -O3
TNN 0.3, Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better): GCC 53.28 (SE +/- 0.00, N = 3; -fopenmp; MIN: 53.24 / MAX: 53.39); Clang 56.61 (SE +/- 0.03, N = 3; -fopenmp=libomp; MIN: 56.51 / MAX: 56.7). (CXX) g++ options: -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better): GCC 1987.03 (SE +/- 0.43, N = 3); Clang 1986.56 (SE +/- 0.12, N = 3). (CC) gcc options: -O3 -lm
SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better): GCC 5124.15 (SE +/- 6.68, N = 3); Clang 6907.38 (SE +/- 8.30, N = 3). (CC) gcc options: -O3 -lm
SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better): GCC 4132.95 (SE +/- 23.87, N = 3); Clang 4508.82 (SE +/- 16.69, N = 3). (CC) gcc options: -O3 -lm
SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better): GCC 563.13 (SE +/- 0.57, N = 3); Clang 500.91 (SE +/- 0.71, N = 3). (CC) gcc options: -O3 -lm
SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better): GCC 463.86 (SE +/- 0.00, N = 3); Clang 450.23 (SE +/- 0.00, N = 3). (CC) gcc options: -O3 -lm
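The SciMark 2.0 Composite score reported earlier (GCC 2454.22, Clang 2870.78) can be cross-checked against the five kernel results, on the understanding that the composite is the arithmetic mean of the kernel scores. The sketch below is an illustration only: the values are copied from this result file, and the helper name `composite` is hypothetical.

```python
# SciMark 2.0 per-kernel results in Mflops, copied from the results above.
SCIMARK_KERNELS = {
    "GCC": {
        "Jacobi Successive Over-Relaxation": 1987.03,
        "Dense LU Matrix Factorization": 5124.15,
        "Sparse Matrix Multiply": 4132.95,
        "Fast Fourier Transform": 563.13,
        "Monte Carlo": 463.86,
    },
    "Clang": {
        "Jacobi Successive Over-Relaxation": 1986.56,
        "Dense LU Matrix Factorization": 6907.38,
        "Sparse Matrix Multiply": 4508.82,
        "Fast Fourier Transform": 500.91,
        "Monte Carlo": 450.23,
    },
}

# Composite values as reported in this result file.
REPORTED_COMPOSITE = {"GCC": 2454.22, "Clang": 2870.78}


def composite(kernels: dict) -> float:
    """Arithmetic mean of the kernel scores (assumed composite definition)."""
    return sum(kernels.values()) / len(kernels)


for compiler, kernels in SCIMARK_KERNELS.items():
    print(f"{compiler:5s} computed {composite(kernels):.2f} "
          f"reported {REPORTED_COMPOSITE[compiler]:.2f}")
```

The computed means agree with the reported composites to two decimal places, which also makes clear that Clang's composite lead comes mainly from Dense LU Matrix Factorization and Sparse Matrix Multiply, since GCC is ahead on Fast Fourier Transform and Monte Carlo.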
LuaJIT This test profile is a collection of Lua scripts/benchmarks run against a locally-built copy of LuaJIT upstream. Learn more via the OpenBenchmarking.org test page.
LuaJIT 2.1-git, Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better): GCC 893.84 (SE +/- 0.02, N = 3); Clang 893.81 (SE +/- 0.04, N = 3). (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Dense LU Matrix Factorization (Mflops, More Is Better): GCC 3156.17 (SE +/- 7.46, N = 3); Clang 3157.76 (SE +/- 0.85, N = 3). (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Sparse Matrix Multiply (Mflops, More Is Better): GCC 1890.85 (SE +/- 13.70, N = 3); Clang 1902.44 (SE +/- 1.66, N = 3). (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Fast Fourier Transform (Mflops, More Is Better): GCC 560.21 (SE +/- 0.16, N = 3); Clang 560.45 (SE +/- 0.91, N = 3). (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Monte Carlo (Mflops, More Is Better): GCC 429.71 (SE +/- 2.66, N = 3); Clang 424.27 (SE +/- 0.17, N = 3). (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
Clang: Testing initiated at 14 August 2022 13:31 by user phoronix.
GCC: Testing initiated at 15 August 2022 06:30 by user phoronix.