Apple M1 compiler testing for a future article.
GCC 11.2.0:
Processor: Apple M1 @ 2.06GHz (4 Cores / 8 Threads), Motherboard: Apple Mac mini (M1 2020), Memory: 8GB, Disk: 251GB APPLE SSD AP0256Q + 2 x 0GB APPLE SSD AP0256Q, Graphics: llvmpipe, Network: Broadcom NetXtreme BCM57762 PCIe + Broadcom BRCM4378 + Broadcom Device 5f69
OS: Arch Linux ARM, Kernel: 5.17.0-rc7-asahi-next-20220310-5-2-ARCH (aarch64), Desktop: KDE Plasma 5.24.4, Display Server: X Server 1.21.1.3, OpenGL: 4.5 Mesa 22.0.1 (LLVM 13.0.1 128 bits), Compiler: GCC 11.2.0 + Clang 13.0.1, File-System: ext4, Screen Resolution: 1920x1080
Environment Notes: CXXFLAGS="-O3 -flto" CFLAGS="-O3 -flto"
Compiler Notes: --build=aarch64-unknown-linux-gnu --disable-libssp --disable-libstdcxx-pch --disable-multilib --disable-werror --enable-__cxa_atexit --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-fix-cortex-a53-835769 --enable-fix-cortex-a53-843419 --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=c,c++,fortran,go,lto,objc,obj-c++,d --enable-lto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-unknown-linux-gnu --mandir=/usr/share/man --with-arch=armv8-a --with-isl --with-linker-hash-style=gnu
Disk Notes: MQ-DEADLINE / relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Clang 13.0.1:
OS: Arch Linux ARM, Kernel: 5.17.0-rc7-asahi-next-20220310-5-2-ARCH (aarch64), Desktop: KDE Plasma 5.24.4, Display Server: X Server 1.21.1.3, OpenGL: 4.5 Mesa 22.0.1 (LLVM 13.0.1 128 bits), Compiler: Clang 13.0.1, File-System: ext4, Screen Resolution: 1920x1080
Environment Notes: CXXFLAGS="-O3 -flto" CFLAGS="-O3 -flto"
Disk Notes: MQ-DEADLINE / relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
GCC 11.2.0 vs. Clang 13.0.1 Comparison (Phoronix Test Suite chart; each bar shows the percentage difference between the two compilers for one test, relative to the baseline).
Largest differences: NCNN CPU - resnet50 172.7%, NCNN CPU - alexnet 168.1%, NCNN CPU - resnet18 148.7%, NCNN CPU - vgg16 138.1%, NCNN CPU - mnasnet 132.5%, NCNN CPU - efficientnet-b0 127.5%, NCNN CPU-v2-v2 - mobilenet-v2 123.8%, NCNN CPU-v3-v3 - mobilenet-v3 106%, NCNN CPU - yolov4-tiny 90.5%, NCNN CPU - googlenet 88.5%, Stress-NG Vector Math 74.9%, NCNN CPU - shufflenet-v2 73.3%, Crypto++ Unkeyed Algorithms 59.4%, NCNN CPU - mobilenet 54.2%, NCNN CPU - squeezenet_ssd 41.3%, NCNN CPU - regnety_400m 37.4%, C-Ray Total Time 36.3%, Crypto++ Keyed Algorithms 35.7%.
The remaining charted tests (Stress-NG, Liquid-DSP, AOBench, Coremark, audio encoding, POV-Ray, libavif avifenc, Himeno, Zstd, LZ4, OpenSSL, eSpeak-NG, libjpeg-turbo tjbench, OpenJPEG, SQLite Speedtest, LeelaChessZero) differ by between 2.2% and 35.4%.
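The percentages in the comparison above appear to express how far the better result exceeds the other, whether the metric is "more is better" or "fewer is better". The following Python sketch works under that assumption and uses values copied from the result overview. The helper name margin_percent is illustrative and is not part of the Phoronix Test Suite.

```python
def margin_percent(a, b):
    """Return by how many percent the larger value exceeds the smaller one."""
    lo, hi = sorted((a, b))
    return (hi / lo - 1.0) * 100.0


# (GCC 11.2.0, Clang 13.0.1) values taken from the result overview.
samples = {
    "ncnn: CPU - resnet50 (ms)": (17.16, 46.79),
    "stress-ng: Vector Math (Bogo Ops/s)": (23954.10, 41899.94),
    "c-ray: Total Time (s)": (64.437, 87.824),
}

margins = {
    name: round(margin_percent(gcc, clang), 1)
    for name, (gcc, clang) in samples.items()
}

for name, margin in margins.items():
    print(f"{name}: {margin}%")
```

Under this assumption the three samples reproduce the charted 172.7%, 74.9% and 36.3%. The margin alone does not say which compiler won; that depends on the direction of the metric.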
Apple M1 Compilers: result overview (OpenBenchmarking.org)
Test | GCC 11.2.0 | Clang 13.0.1
cryptopp: All Algorithms | 954.956113 | 823.153532
cryptopp: Keyed Algorithms | 508.836448 | 374.896175
cryptopp: Unkeyed Algorithms | 539.281827 | 338.408369
cryptopp: Integer + Elliptic Curve Public Key Algorithms | 1766.985880 | 1875.523520
lczero: Eigen | 1263 | 1297
xmrig: Monero - 1M | 2247.2 | 2209.7
xmrig: Wownero - 1M | 2798.2 | 2804.8
compress-lz4: 1 - Compression Speed | 21909.45 | 21875.54
compress-lz4: 1 - Decompression Speed | 27018.5 | 26736.4
compress-lz4: 3 - Compression Speed | 51.99 | 51.32
compress-lz4: 3 - Decompression Speed | 17490.9 | 16877.4
compress-lz4: 9 - Compression Speed | 48.94 | 49.89
compress-lz4: 9 - Decompression Speed | 17478.5 | 16863.3
compress-zstd: 3 - Compression Speed | 3341.2 | 3301.1
compress-zstd: 3 - Decompression Speed | 3850.2 | 3977.7
compress-zstd: 8 - Compression Speed | 721.5 | 699.6
compress-zstd: 8 - Decompression Speed | 4016.4 | 4141.0
compress-zstd: 19 - Compression Speed | 22.7 | 23.2
compress-zstd: 19 - Decompression Speed | 3546.2 | 3684.3
compress-zstd: 3, Long Mode - Compression Speed | 240.0 | 253.7
compress-zstd: 3, Long Mode - Decompression Speed | 4221.1 | 4356.6
compress-zstd: 8, Long Mode - Compression Speed | 693.0 | 703.4
compress-zstd: 8, Long Mode - Decompression Speed | 4416.3 | 4553.4
compress-zstd: 19, Long Mode - Compression Speed | 18.8 | 18.8
compress-zstd: 19, Long Mode - Decompression Speed | 3765.4 | 3887.3
coremark: CoreMark Size 666 - Iterations Per Second | 179896.599411 | 148361.362440
himeno: Poisson Pressure Solver | 7577.316534 | 7158.970486
avifenc: 0 | 287.397 | 303.550
avifenc: 2 | 143.442 | 161.612
avifenc: 6 | 14.094 | 13.516
avifenc: 6, Lossless | 15.653 | 14.929
avifenc: 10, Lossless | 6.070 | 5.887
c-ray: Total Time - 4K, 16 Rays Per Pixel | 64.437 | 87.824
povray: Trace Time | 72.017 | 62.416
primesieve: 1e12 Prime Number Generation | 29.118 | 29.626
aobench: 2048 x 2048 - Total Time | 27.458 | 33.402
encode-flac: WAV To FLAC | 70.648 | 59.074
encode-mp3: WAV To MP3 | 7.239 | 8.124
espeak: Text-To-Speech Synthesis | 22.289 | 23.429
openjpeg: NASA Curiosity Panorama M34 | 53890 | 52024
openssl: SHA256 | 8059691050 | 8474527350
openssl: RSA4096 (sign/s) | 1408.5 | 1391.4
openssl: RSA4096 (verify/s) | 99370.5 | 99445.4
liquid-dsp: 1 - 256 - 57 | 28778667 | 37897333
liquid-dsp: 2 - 256 - 57 | 57611000 | 75898667
liquid-dsp: 4 - 256 - 57 | 115230000 | 151820000
liquid-dsp: 8 - 256 - 57 | 151120000 | 196510000
tjbench: Decompression Throughput | 206.177350 | 197.945225
sqlite-speedtest: Timed Time - Size 1,000 | 51.372 | 52.900
draco: Lion | 3747 | 3772
draco: Church Facade | 5649 | 5722
stress-ng: Crypto | 1511.75 | 1527.17
stress-ng: IO_uring | 144281.67 | 147040.98
stress-ng: Matrix Math | 23588.96 | 30254.21
stress-ng: Vector Math | 23954.10 | 41899.94
stress-ng: Memory Copying | 2763.25 | 3741.17
stress-ng: Socket Activity | 4331.71 | 4313.48
ncnn: CPU - mobilenet | 14.40 | 22.21
ncnn: CPU-v2-v2 - mobilenet-v2 | 2.61 | 5.84
ncnn: CPU-v3-v3 - mobilenet-v3 | 2.34 | 4.82
ncnn: CPU - shufflenet-v2 | 2.17 | 3.76
ncnn: CPU - mnasnet | 2.52 | 5.86
ncnn: CPU - efficientnet-b0 | 4.18 | 9.51
ncnn: CPU - googlenet | 13.32 | 25.11
ncnn: CPU - vgg16 | 33.78 | 80.44
ncnn: CPU - resnet18 | 7.31 | 18.18
ncnn: CPU - alexnet | 11.81 | 31.66
ncnn: CPU - resnet50 | 17.16 | 46.79
ncnn: CPU - yolov4-tiny | 17.20 | 32.77
ncnn: CPU - squeezenet_ssd | 14.26 | 20.15
ncnn: CPU - regnety_400m | 5.88 | 8.08
encode-wavpack: WAV To WavPack | 17.205 | 19.320
Crypto++ 8.2 (MiB/second, more is better; (CXX) g++ options: -O3 -flto -fPIC -pthread -pipe)
Test: Keyed Algorithms: GCC 11.2.0 508.84 (SE +/- 0.07, N = 3); Clang 13.0.1 374.90 (SE +/- 1.08, N = 3)
Test: Unkeyed Algorithms: GCC 11.2.0 539.28 (SE +/- 0.04, N = 3); Clang 13.0.1 338.41 (SE +/- 0.01, N = 3)
Test: Integer + Elliptic Curve Public Key Algorithms: GCC 11.2.0 1766.99 (SE +/- 0.67, N = 3); Clang 13.0.1 1875.52 (SE +/- 1.78, N = 3)
LeelaChessZero
LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking.
LeelaChessZero 0.28 (Nodes Per Second, more is better; (CXX) g++ options: -flto -O3 -pthread)
Backend: Eigen: GCC 11.2.0 1263 (SE +/- 10.69, N = 3); Clang 13.0.1 1297 (SE +/- 18.26, N = 3)
Xmrig
Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance.
Xmrig 6.12.1 (H/s, more is better; (CXX) g++ options: -O3 -flto -fexceptions -fno-rtti -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc; additional options reported: -static-libgcc -static-libstdc++ -funroll-loops)
Variant: Monero - Hash Count: 1M: GCC 11.2.0 2247.2 (SE +/- 9.05, N = 3); Clang 13.0.1 2209.7 (SE +/- 7.70, N = 3)
Variant: Wownero - Hash Count: 1M: GCC 11.2.0 2798.2 (SE +/- 1.83, N = 3); Clang 13.0.1 2804.8 (SE +/- 1.95, N = 3)
LZ4 Compression 1.9.3 (MB/s, more is better; (CC) gcc options: -O3)
Compression Level: 1 - Decompression Speed: GCC 11.2.0 27018.5 (SE +/- 1.47, N = 3); Clang 13.0.1 26736.4 (SE +/- 8.86, N = 3)
Compression Level: 3 - Decompression Speed: GCC 11.2.0 17490.9 (SE +/- 0.40, N = 3); Clang 13.0.1 16877.4 (SE +/- 3.46, N = 3)
Compression Level: 9 - Decompression Speed: GCC 11.2.0 17478.5 (SE +/- 1.03, N = 3); Clang 13.0.1 16863.3 (SE +/- 3.18, N = 3)
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings.
Zstd Compression 1.5.0 (MB/s, more is better; (CC) gcc options: -O3 -flto -pthread -lz -llzma -llz4)
Compression Level: 3 - Compression Speed: GCC 11.2.0 3341.2 (SE +/- 6.19, N = 3); Clang 13.0.1 3301.1 (SE +/- 39.46, N = 3)
Compression Level: 3 - Decompression Speed: GCC 11.2.0 3850.2 (SE +/- 0.87, N = 3); Clang 13.0.1 3977.7 (SE +/- 0.75, N = 3)
Compression Level: 8 - Compression Speed: GCC 11.2.0 721.5 (SE +/- 3.70, N = 3); Clang 13.0.1 699.6 (SE +/- 4.97, N = 3)
Compression Level: 8 - Decompression Speed: GCC 11.2.0 4016.4 (SE +/- 1.95, N = 3); Clang 13.0.1 4141.0 (SE +/- 3.02, N = 3)
Compression Level: 19 - Compression Speed: GCC 11.2.0 22.7 (SE +/- 0.07, N = 3); Clang 13.0.1 23.2 (SE +/- 0.17, N = 3)
Compression Level: 19 - Decompression Speed: GCC 11.2.0 3546.2 (SE +/- 0.15, N = 3); Clang 13.0.1 3684.3 (SE +/- 1.62, N = 3)
Compression Level: 3, Long Mode - Compression Speed: GCC 11.2.0 240.0 (SE +/- 2.00, N = 15); Clang 13.0.1 253.7 (SE +/- 3.51, N = 3)
Compression Level: 3, Long Mode - Decompression Speed: GCC 11.2.0 4221.1 (SE +/- 0.25, N = 15); Clang 13.0.1 4356.6 (SE +/- 0.40, N = 3)
Compression Level: 8, Long Mode - Compression Speed: GCC 11.2.0 693.0 (SE +/- 2.35, N = 3); Clang 13.0.1 703.4 (SE +/- 2.38, N = 3)
Compression Level: 8, Long Mode - Decompression Speed: GCC 11.2.0 4416.3 (SE +/- 1.55, N = 3); Clang 13.0.1 4553.4 (SE +/- 3.13, N = 3)
Compression Level: 19, Long Mode - Compression Speed: GCC 11.2.0 18.8 (SE +/- 0.13, N = 3); Clang 13.0.1 18.8 (SE +/- 0.21, N = 4)
Compression Level: 19, Long Mode - Decompression Speed: GCC 11.2.0 3765.4 (SE +/- 0.92, N = 3); Clang 13.0.1 3887.3 (SE +/- 0.69, N = 4)
libavif avifenc 0.10 (Seconds, fewer is better; (CXX) g++ options: -O3 -fPIC -flto -lm)
Encoder Speed: 2: GCC 11.2.0 143.44 (SE +/- 0.32, N = 3); Clang 13.0.1 161.61 (SE +/- 0.72, N = 3)
Encoder Speed: 6, Lossless: GCC 11.2.0 15.65 (SE +/- 0.18, N = 3); Clang 13.0.1 14.93 (SE +/- 0.21, N = 3)
Encoder Speed: 10, Lossless: GCC 11.2.0 6.070 (SE +/- 0.049, N = 3); Clang 13.0.1 5.887 (SE +/- 0.047, N = 3)
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image.
C-Ray 1.1 (Seconds, fewer is better; (CC) gcc options: -lm -lpthread -O3 -flto)
Total Time - 4K, 16 Rays Per Pixel: GCC 11.2.0 64.44 (SE +/- 0.04, N = 3); Clang 13.0.1 87.82 (SE +/- 0.05, N = 3)
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing.
POV-Ray 3.7.0.7 (Seconds, fewer is better; (CXX) g++ options: -pipe -O3 -ffast-math -flto -lSDL -lpthread -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system; additional options reported: -R/usr/lib)
Trace Time: GCC 11.2.0 72.02 (SE +/- 0.85, N = 4); Clang 13.0.1 62.42 (SE +/- 0.64, N = 5)
eSpeak-NG Speech Engine
This test times how long it takes the eSpeak speech synthesizer to read Project Gutenberg's The Outline of Science and output to a WAV file. This test profile is now tracking the eSpeak-NG version of eSpeak.
eSpeak-NG Speech Engine 20200907 (Seconds, fewer is better; (CC) gcc options: -O3 -flto -std=c99 -lpthread -lm)
Text-To-Speech Synthesis: GCC 11.2.0 22.29 (SE +/- 0.03, N = 4); Clang 13.0.1 23.43 (SE +/- 0.03, N = 4)
OpenJPEG
OpenJPEG is an open-source JPEG 2000 codec written in the C programming language. The default input for this test profile is the NASA/JPL-Caltech/MSSS Curiosity panorama 717MB TIFF image file converting to JPEG2000 format.
OpenJPEG 2.4 (ms, fewer is better; (CXX) g++ options: -O3 -flto -rdynamic)
Encode: NASA Curiosity Panorama M34: GCC 11.2.0 53890 (SE +/- 92.73, N = 3); Clang 13.0.1 52024 (SE +/- 161.48, N = 3)
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities.
OpenSSL 3.0 (more is better; (CC) gcc options: -pthread -O3 -flto -lssl -lcrypto -ldl; additional options reported: -Qunused-arguments)
Algorithm: SHA256 (byte/s): GCC 11.2.0 8059691050 (SE +/- 12283962.01, N = 3); Clang 13.0.1 8474527350 (SE +/- 3887401.32, N = 3)
Algorithm: RSA4096 (sign/s): GCC 11.2.0 1408.5 (SE +/- 0.78, N = 3); Clang 13.0.1 1391.4 (SE +/- 0.15, N = 3)
Algorithm: RSA4096 (verify/s): GCC 11.2.0 99370.5 (SE +/- 18.59, N = 3); Clang 13.0.1 99445.4 (SE +/- 16.80, N = 3)
Liquid-DSP
LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage.
Liquid-DSP 2021.01.31 (samples/s, more is better; (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid)
Threads: 1 - Buffer Length: 256 - Filter Length: 57: GCC 11.2.0 28778667 (SE +/- 3527.67, N = 3); Clang 13.0.1 37897333 (SE +/- 2905.93, N = 3)
Threads: 2 - Buffer Length: 256 - Filter Length: 57: GCC 11.2.0 57611000 (SE +/- 2081.67, N = 3); Clang 13.0.1 75898667 (SE +/- 1763.83, N = 3)
Threads: 4 - Buffer Length: 256 - Filter Length: 57: GCC 11.2.0 115230000 (SE +/- 0.00, N = 3); Clang 13.0.1 151820000 (SE +/- 0.00, N = 3)
Threads: 8 - Buffer Length: 256 - Filter Length: 57: GCC 11.2.0 151120000 (SE +/- 0.00, N = 3); Clang 13.0.1 196510000 (SE +/- 0.00, N = 3)
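The Liquid-DSP results at 1, 2, 4 and 8 threads can be read as a thread-scaling curve. The Python sketch below computes the speedup over the single-threaded result for each compiler, using only the values reported above. The function name scaling is illustrative.

```python
# Liquid-DSP samples/s by thread count, copied from the results above.
results = {
    "GCC 11.2.0": {1: 28778667, 2: 57611000, 4: 115230000, 8: 151120000},
    "Clang 13.0.1": {1: 37897333, 2: 75898667, 4: 151820000, 8: 196510000},
}


def scaling(per_thread):
    """Return the speedup of each thread count over the 1-thread result."""
    base = per_thread[1]
    return {threads: value / base for threads, value in per_thread.items()}


speedups = {compiler: scaling(values) for compiler, values in results.items()}

for compiler, curve in speedups.items():
    line = ", ".join(f"{t} threads: {s:.2f}x" for t, s in curve.items())
    print(f"{compiler}: {line}")
```

Both compilers scale almost linearly up to four threads and reach a little over 5x at eight threads. One possible explanation, which these results do not confirm, is that the M1 combines four performance cores with four efficiency cores, so the last four threads would add less throughput.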
Google Draco
Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression.
Google Draco 1.5.0 (ms, fewer is better; (CXX) g++ options: -O3 -flto)
Model: Lion: GCC 11.2.0 3747 (SE +/- 2.73, N = 3); Clang 13.0.1 3772 (SE +/- 0.58, N = 3)
Model: Church Facade: GCC 11.2.0 5649 (SE +/- 7.21, N = 3); Clang 13.0.1 5722 (SE +/- 3.79, N = 3)
Stress-NG 0.13.02 (Bogo Ops/s, more is better; (CC) gcc options: -O3 -flto -O2 -std=gnu99 -lm -laio -lbsd -lcrypt -lrt -lz -ldl -pthread -lkmod -lc -latomic)
Test: IO_uring: GCC 11.2.0 144281.67 (SE +/- 28.54, N = 3); Clang 13.0.1 147040.98 (SE +/- 271.95, N = 3)
Test: Matrix Math: GCC 11.2.0 23588.96 (SE +/- 332.61, N = 3); Clang 13.0.1 30254.21 (SE +/- 0.69, N = 3)
Test: Vector Math: GCC 11.2.0 23954.10 (SE +/- 195.44, N = 15); Clang 13.0.1 41899.94 (SE +/- 2.19, N = 3)
Test: Memory Copying: GCC 11.2.0 2763.25 (SE +/- 6.71, N = 3); Clang 13.0.1 3741.17 (SE +/- 15.21, N = 3)
Test: Socket Activity: GCC 11.2.0 4331.71 (SE +/- 4.58, N = 3); Clang 13.0.1 4313.48 (SE +/- 13.20, N = 3)
NCNN
NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent.
NCNN 20210720 (ms, fewer is better; (CXX) g++ options: -O3 -flto -rdynamic; additional options reported: -lgomp -lpthread)
Target: CPU - Model: mobilenet: GCC 11.2.0 14.40 (SE +/- 0.17, N = 3; MIN: 9.21 / MAX: 25.2); Clang 13.0.1 22.21 (SE +/- 0.01, N = 3; MIN: 22.15 / MAX: 22.25)
Target: CPU-v2-v2 - Model: mobilenet-v2: GCC 11.2.0 2.61 (SE +/- 0.05, N = 3; MIN: 2.48 / MAX: 12.2); Clang 13.0.1 5.84 (SE +/- 0.01, N = 3; MIN: 5.81 / MAX: 5.87)
Target: CPU-v3-v3 - Model: mobilenet-v3: GCC 11.2.0 2.34 (SE +/- 0.01, N = 3; MIN: 2.32 / MAX: 2.49); Clang 13.0.1 4.82 (SE +/- 0.01, N = 3; MIN: 4.8 / MAX: 4.85)
Target: CPU - Model: shufflenet-v2: GCC 11.2.0 2.17 (SE +/- 0.01, N = 3; MIN: 2.15 / MAX: 2.48); Clang 13.0.1 3.76 (SE +/- 0.00, N = 3; MAX: 3.85)
Target: CPU - Model: mnasnet: GCC 11.2.0 2.52 (SE +/- 0.01, N = 3; MIN: 2.48 / MAX: 2.84); Clang 13.0.1 5.86 (SE +/- 0.01, N = 2; MIN: 5.84 / MAX: 5.87)
Target: CPU - Model: efficientnet-b0: GCC 11.2.0 4.18 (SE +/- 0.02, N = 3; MIN: 4.13 / MAX: 8.1); Clang 13.0.1 9.51 (SE +/- 0.00, N = 3; MIN: 9.47 / MAX: 9.67)
Target: CPU - Model: googlenet: GCC 11.2.0 13.32 (SE +/- 0.10, N = 3; MIN: 9.14 / MAX: 21.97); Clang 13.0.1 25.11 (SE +/- 0.01, N = 3; MIN: 25.07 / MAX: 25.16)
Target: CPU - Model: vgg16: GCC 11.2.0 33.78 (SE +/- 0.14, N = 3; MIN: 30.68 / MAX: 45.72); Clang 13.0.1 80.44 (SE +/- 0.01, N = 3; MIN: 80.22 / MAX: 80.95)
Target: CPU - Model: resnet18: GCC 11.2.0 7.31 (SE +/- 0.04, N = 3; MIN: 6.17 / MAX: 16.92); Clang 13.0.1 18.18 (SE +/- 0.01, N = 3; MIN: 18.14 / MAX: 18.23)
Target: CPU - Model: alexnet: GCC 11.2.0 11.81 (SE +/- 0.10, N = 3; MIN: 9.48 / MAX: 21.58); Clang 13.0.1 31.66 (SE +/- 0.00, N = 3; MIN: 31.62 / MAX: 33.42)
Target: CPU - Model: resnet50: GCC 11.2.0 17.16 (SE +/- 0.08, N = 3; MIN: 15.54 / MAX: 27.86); Clang 13.0.1 46.79 (SE +/- 0.01, N = 3; MIN: 46.7 / MAX: 46.9)
Target: CPU - Model: yolov4-tiny: GCC 11.2.0 17.20 (SE +/- 0.07, N = 3; MIN: 14.01 / MAX: 27.33); Clang 13.0.1 32.77 (SE +/- 0.00, N = 3; MIN: 32.68 / MAX: 32.88)
Target: CPU - Model: squeezenet_ssd: GCC 11.2.0 14.26 (SE +/- 0.17, N = 3; MIN: 9.6 / MAX: 28.57); Clang 13.0.1 20.15 (SE +/- 0.00, N = 3; MIN: 20.08 / MAX: 20.21)
Target: CPU - Model: regnety_400m: GCC 11.2.0 5.88 (SE +/- 0.03, N = 3; MIN: 5.78 / MAX: 8.62); Clang 13.0.1 8.08 (SE +/- 0.00, N = 3; MIN: 8.05 / MAX: 8.15)
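To summarize the fourteen NCNN results in one figure, a geometric mean of the per-model time ratios is a common choice for benchmark aggregates. The Python sketch below is one way to compute it from the reported averages. It is an illustration, not a figure published with these results, and the variable names are hypothetical.

```python
from statistics import geometric_mean

# NCNN inference time in ms (fewer is better): (GCC 11.2.0, Clang 13.0.1),
# copied from the NCNN results above.
ncnn_ms = {
    "mobilenet": (14.40, 22.21),
    "mobilenet-v2": (2.61, 5.84),
    "mobilenet-v3": (2.34, 4.82),
    "shufflenet-v2": (2.17, 3.76),
    "mnasnet": (2.52, 5.86),
    "efficientnet-b0": (4.18, 9.51),
    "googlenet": (13.32, 25.11),
    "vgg16": (33.78, 80.44),
    "resnet18": (7.31, 18.18),
    "alexnet": (11.81, 31.66),
    "resnet50": (17.16, 46.79),
    "yolov4-tiny": (17.20, 32.77),
    "squeezenet_ssd": (14.26, 20.15),
    "regnety_400m": (5.88, 8.08),
}

# A ratio above 1 means the Clang build took longer than the GCC build.
ratios = {model: clang / gcc for model, (gcc, clang) in ncnn_ms.items()}
overall = geometric_mean(list(ratios.values()))

print(f"Clang/GCC time ratio, geometric mean over {len(ratios)} models: {overall:.2f}")
```

Every ratio is above 1, and the geometric mean comes out at roughly 2, meaning the Clang build took about twice as long per inference as the GCC build across these models.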
GCC 11.2.0: Testing initiated at 8 April 2022 14:42 by user phoronix.
Clang 13.0.1: Testing initiated at 9 April 2022 12:49 by user phoronix.