Apple M1 compiler testing for a future article.
GCC 11.2.0
Processor: Apple M1 @ 2.06GHz (4 Cores / 8 Threads), Motherboard: Apple Mac mini (M1 2020), Memory: 8GB, Disk: 251GB APPLE SSD AP0256Q + 2 x 0GB APPLE SSD AP0256Q, Graphics: llvmpipe, Network: Broadcom NetXtreme BCM57762 PCIe + Broadcom BRCM4378 + Broadcom Device 5f69
OS: Arch Linux ARM, Kernel: 5.17.0-rc7-asahi-next-20220310-5-2-ARCH (aarch64), Desktop: KDE Plasma 5.24.4, Display Server: X Server 1.21.1.3, OpenGL: 4.5 Mesa 22.0.1 (LLVM 13.0.1 128 bits), Compiler: GCC 11.2.0 + Clang 13.0.1, File-System: ext4, Screen Resolution: 1920x1080
Environment Notes: CXXFLAGS="-O3 -flto" CFLAGS="-O3 -flto"
Compiler Notes: --build=aarch64-unknown-linux-gnu --disable-libssp --disable-libstdcxx-pch --disable-multilib --disable-werror --enable-__cxa_atexit --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-fix-cortex-a53-835769 --enable-fix-cortex-a53-843419 --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=c,c++,fortran,go,lto,objc,obj-c++,d --enable-lto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-unknown-linux-gnu --mandir=/usr/share/man --with-arch=armv8-a --with-isl --with-linker-hash-style=gnu
Disk Notes: MQ-DEADLINE / relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Clang 13.0.1
OS: Arch Linux ARM, Kernel: 5.17.0-rc7-asahi-next-20220310-5-2-ARCH (aarch64), Desktop: KDE Plasma 5.24.4, Display Server: X Server 1.21.1.3, OpenGL: 4.5 Mesa 22.0.1 (LLVM 13.0.1 128 bits), Compiler: Clang 13.0.1, File-System: ext4, Screen Resolution: 1920x1080
Environment Notes: CXXFLAGS="-O3 -flto" CFLAGS="-O3 -flto"
Disk Notes: MQ-DEADLINE / relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: apple-cpufreq schedutil
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
[Chart: GCC 11.2.0 vs. Clang 13.0.1 Comparison (Phoronix Test Suite). Per-test percentage difference between the two compilers relative to the baseline; the largest gap shown is NCNN, CPU - resnet50, at 172.7%.]
Apple M1 Compilers
Test                                                      GCC 11.2.0      Clang 13.0.1
cryptopp: All Algorithms                                  954.956113      823.153532
xmrig: Monero - 1M                                        2247.2          2209.7
cryptopp: Keyed Algorithms                                508.836448      374.896175
lczero: Eigen                                             1263            1297
xmrig: Wownero - 1M                                       2798.2          2804.8
avifenc: 0                                                287.397         303.550
cryptopp: Integer + Elliptic Curve Public Key Algorithms  1766.985880     1875.523520
openssl: SHA256                                           8059691050      8474527350
avifenc: 2                                                143.442         161.612
compress-zstd: 3, Long Mode - Decompression Speed         4221.1          4356.6
compress-zstd: 3, Long Mode - Compression Speed           240.0           253.7
encode-flac: WAV To FLAC                                  70.648          59.074
povray: Trace Time                                        72.017          62.416
stress-ng: Vector Math                                    23954.10        41899.94
compress-zstd: 19, Long Mode - Decompression Speed        3765.4          3887.3
compress-zstd: 19, Long Mode - Compression Speed          18.8            18.8
ncnn: CPU - regnety_400m                                  5.88            8.08
ncnn: CPU - squeezenet_ssd                                14.26           20.15
ncnn: CPU - yolov4-tiny                                   17.20           32.77
ncnn: CPU - resnet50                                      17.16           46.79
ncnn: CPU - alexnet                                       11.81           31.66
ncnn: CPU - resnet18                                      7.31            18.18
ncnn: CPU - vgg16                                         33.78           80.44
ncnn: CPU - googlenet                                     13.32           25.11
ncnn: CPU - efficientnet-b0                               4.18            9.51
ncnn: CPU - mnasnet                                       2.52            5.86
ncnn: CPU - shufflenet-v2                                 2.17            3.76
ncnn: CPU-v3-v3 - mobilenet-v3                            2.34            4.82
ncnn: CPU-v2-v2 - mobilenet-v2                            2.61            5.84
ncnn: CPU - mobilenet                                     14.40           22.21
c-ray: Total Time - 4K, 16 Rays Per Pixel                 64.437          87.824
cryptopp: Unkeyed Algorithms                              539.281827      338.408369
compress-zstd: 19 - Decompression Speed                   3546.2          3684.3
compress-zstd: 19 - Compression Speed                     22.7            23.2
compress-lz4: 9 - Decompression Speed                     17478.5         16863.3
compress-lz4: 9 - Compression Speed                       48.94           49.89
openssl: RSA4096 (verify/s)                               99370.5         99445.4
openssl: RSA4096 (sign/s)                                 1408.5          1391.4
compress-lz4: 3 - Decompression Speed                     17490.9         16877.4
compress-lz4: 3 - Compression Speed                       51.99           51.32
sqlite-speedtest: Timed Time - Size 1,000                 51.372          52.900
compress-zstd: 8, Long Mode - Decompression Speed         4416.3          4553.4
compress-zstd: 8, Long Mode - Compression Speed           693.0           703.4
compress-zstd: 8 - Decompression Speed                    4016.4          4141.0
compress-zstd: 8 - Compression Speed                      721.5           699.6
compress-zstd: 3 - Decompression Speed                    3850.2          3977.7
compress-zstd: 3 - Compression Speed                      3341.2          3301.1
espeak: Text-To-Speech Synthesis                          22.289          23.429
encode-wavpack: WAV To WavPack                            17.205          19.320
aobench: 2048 x 2048 - Total Time                         27.458          33.402
stress-ng: Crypto                                         1511.75         1527.17
stress-ng: Memory Copying                                 2763.25         3741.17
stress-ng: IO_uring                                       144281.67       147040.98
stress-ng: Matrix Math                                    23588.96        30254.21
stress-ng: Socket Activity                                4331.71         4313.48
primesieve: 1e12 Prime Number Generation                  29.118          29.626
compress-lz4: 1 - Decompression Speed                     27018.5         26736.4
compress-lz4: 1 - Compression Speed                       21909.45        21875.54
coremark: CoreMark Size 666 - Iterations Per Second       179896.599411   148361.362440
avifenc: 6                                                14.094          13.516
tjbench: Decompression Throughput                         206.177350      197.945225
liquid-dsp: 8 - 256 - 57                                  151120000       196510000
liquid-dsp: 4 - 256 - 57                                  115230000       151820000
liquid-dsp: 2 - 256 - 57                                  57611000        75898667
liquid-dsp: 1 - 256 - 57                                  28778667        37897333
avifenc: 6, Lossless                                      15.653          14.929
himeno: Poisson Pressure Solver                           7577.316534     7158.970486
openjpeg: NASA Curiosity Panorama M34                     53890           52024
encode-mp3: WAV To MP3                                    7.239           8.124
draco: Church Facade                                      5649            5722
avifenc: 10, Lossless                                     6.070           5.887
draco: Lion                                               3747            3772
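The percentages in the comparison chart appear to be the lead of the faster compiler over the slower one on each test. A minimal Python sketch of that calculation follows, assuming the metric direction (more or fewer is better) is known per test. The function name `relative_gain` is hypothetical and is not part of the Phoronix Test Suite; the input values are taken from the results on this page.

```python
def relative_gain(gcc, clang, higher_is_better):
    """Return (winner, percent lead of the winner over the loser)."""
    if gcc == clang:
        return ("tie", 0.0)
    # Normalise so that a larger score is always better
    # (for time-based metrics, invert the measurement).
    if higher_is_better:
        gcc_score, clang_score = gcc, clang
    else:
        gcc_score, clang_score = 1.0 / gcc, 1.0 / clang
    if gcc_score > clang_score:
        return ("GCC 11.2.0", (gcc_score / clang_score - 1.0) * 100.0)
    return ("Clang 13.0.1", (clang_score / gcc_score - 1.0) * 100.0)


# (test, GCC result, Clang result, higher_is_better)
samples = [
    ("ncnn: CPU - resnet50 (ms)", 17.16, 46.79, False),
    ("stress-ng: Memory Copying (Bogo Ops/s)", 2763.25, 3741.17, True),
    ("liquid-dsp: 8 - 256 - 57 (samples/s)", 151120000, 196510000, True),
]

for name, gcc, clang, higher_is_better in samples:
    winner, pct = relative_gain(gcc, clang, higher_is_better)
    print(f"{name}: {winner} ahead by {pct:.1f}%")
```

For these three tests the sketch gives leads that round to the 172.7%, 35.4% and 30% figures visible in the chart, which suggests the chart uses the same faster-over-slower ratio.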
Xmrig
Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure Xmrig CPU mining performance.
Xmrig 6.12.1, Variant: Monero - Hash Count: 1M (H/s, More Is Better)
  GCC 11.2.0: 2247.2 (SE +/- 9.05, N = 3)
  Clang 13.0.1: 2209.7 (SE +/- 7.70, N = 3)
  (CXX) g++ options: -O3 -flto -fexceptions -fno-rtti -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
  Additional per-run flags reported: -static-libgcc -static-libstdc++ -funroll-loops
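With only N = 3 runs per compiler, a small gap can sit inside run-to-run noise. As a rough check (a sketch under stated assumptions, not something the Phoronix Test Suite reports), the difference between two means can be expressed in units of their combined standard error, assuming the runs are independent. `gap_in_standard_errors` is a hypothetical helper; the numbers are the Monero figures above.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Xmrig 6.12.1, Monero - 1M, H/s: GCC 11.2.0 vs. Clang 13.0.1.
monero_gap = gap_in_standard_errors(2247.2, 9.05, 2209.7, 7.70)
print(f"Monero gap: {monero_gap:.1f} combined standard errors")
```

A value well above 2 hints that the gap is larger than the measured noise, but with so few runs this is an indication only, not a formal significance test.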
LeelaChessZero
LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking.
LeelaChessZero 0.28, Backend: Eigen (Nodes Per Second, More Is Better)
  GCC 11.2.0: 1263 (SE +/- 10.69, N = 3)
  Clang 13.0.1: 1297 (SE +/- 18.26, N = 3)
  (CXX) g++ options: -flto -O3 -pthread
Xmrig
Xmrig 6.12.1, Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
  GCC 11.2.0: 2798.2 (SE +/- 1.83, N = 3)
  Clang 13.0.1: 2804.8 (SE +/- 1.95, N = 3)
  (CXX) g++ options: -O3 -flto -fexceptions -fno-rtti -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
  Additional per-run flags reported: -static-libgcc -static-libstdc++ -funroll-loops
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities.
OpenSSL 3.0, Algorithm: SHA256 (byte/s, More Is Better)
  GCC 11.2.0: 8059691050 (SE +/- 12283962.01, N = 3)
  Clang 13.0.1: 8474527350 (SE +/- 3887401.32, N = 3)
  (CC) gcc options: -pthread -O3 -flto -lssl -lcrypto -ldl
  Additional per-run flags reported: -Qunused-arguments
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings.
Zstd Compression 1.5.0, Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better)
  GCC 11.2.0: 4221.1 (SE +/- 0.25, N = 15)
  Clang 13.0.1: 4356.6 (SE +/- 0.40, N = 3)
Zstd Compression 1.5.0, Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 240.0 (SE +/- 2.00, N = 15)
  Clang 13.0.1: 253.7 (SE +/- 3.51, N = 3)
(CC) gcc options for both results: -O3 -flto -pthread -lz -llzma -llz4
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing.
POV-Ray 3.7.0.7, Trace Time (Seconds, Fewer Is Better)
  GCC 11.2.0: 72.02 (SE +/- 0.85, N = 4)
  Clang 13.0.1: 62.42 (SE +/- 0.64, N = 5)
  (CXX) g++ options: -pipe -O3 -ffast-math -flto -lSDL -lpthread -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
  Additional per-run flags reported: -R/usr/lib
Zstd Compression
Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
  GCC 11.2.0: 3765.4 (SE +/- 0.92, N = 3)
  Clang 13.0.1: 3887.3 (SE +/- 0.69, N = 4)
Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 18.8 (SE +/- 0.13, N = 3)
  Clang 13.0.1: 18.8 (SE +/- 0.21, N = 4)
(CC) gcc options for both results: -O3 -flto -pthread -lz -llzma -llz4
NCNN
NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent.
NCNN 20210720, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  GCC 11.2.0: 5.88 (SE +/- 0.03, N = 3; MIN: 5.78 / MAX: 8.62)
  Clang 13.0.1: 8.08 (SE +/- 0.00, N = 3; MIN: 8.05 / MAX: 8.15)
NCNN 20210720, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  GCC 11.2.0: 14.26 (SE +/- 0.17, N = 3; MIN: 9.6 / MAX: 28.57)
  Clang 13.0.1: 20.15 (SE +/- 0.00, N = 3; MIN: 20.08 / MAX: 20.21)
NCNN 20210720, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  GCC 11.2.0: 17.20 (SE +/- 0.07, N = 3; MIN: 14.01 / MAX: 27.33)
  Clang 13.0.1: 32.77 (SE +/- 0.00, N = 3; MIN: 32.68 / MAX: 32.88)
NCNN 20210720, Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  GCC 11.2.0: 17.16 (SE +/- 0.08, N = 3; MIN: 15.54 / MAX: 27.86)
  Clang 13.0.1: 46.79 (SE +/- 0.01, N = 3; MIN: 46.7 / MAX: 46.9)
NCNN 20210720, Target: CPU - Model: alexnet (ms, Fewer Is Better)
  GCC 11.2.0: 11.81 (SE +/- 0.10, N = 3; MIN: 9.48 / MAX: 21.58)
  Clang 13.0.1: 31.66 (SE +/- 0.00, N = 3; MIN: 31.62 / MAX: 33.42)
NCNN 20210720, Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  GCC 11.2.0: 7.31 (SE +/- 0.04, N = 3; MIN: 6.17 / MAX: 16.92)
  Clang 13.0.1: 18.18 (SE +/- 0.01, N = 3; MIN: 18.14 / MAX: 18.23)
NCNN 20210720, Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  GCC 11.2.0: 33.78 (SE +/- 0.14, N = 3; MIN: 30.68 / MAX: 45.72)
  Clang 13.0.1: 80.44 (SE +/- 0.01, N = 3; MIN: 80.22 / MAX: 80.95)
NCNN 20210720, Target: CPU - Model: googlenet (ms, Fewer Is Better)
  GCC 11.2.0: 13.32 (SE +/- 0.10, N = 3; MIN: 9.14 / MAX: 21.97)
  Clang 13.0.1: 25.11 (SE +/- 0.01, N = 3; MIN: 25.07 / MAX: 25.16)
NCNN 20210720, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  GCC 11.2.0: 4.18 (SE +/- 0.02, N = 3; MIN: 4.13 / MAX: 8.1)
  Clang 13.0.1: 9.51 (SE +/- 0.00, N = 3; MIN: 9.47 / MAX: 9.67)
NCNN 20210720, Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  GCC 11.2.0: 2.52 (SE +/- 0.01, N = 3; MIN: 2.48 / MAX: 2.84)
  Clang 13.0.1: 5.86 (SE +/- 0.01, N = 2; MIN: 5.84 / MAX: 5.87)
NCNN 20210720, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  GCC 11.2.0: 2.17 (SE +/- 0.01, N = 3; MIN: 2.15 / MAX: 2.48)
  Clang 13.0.1: 3.76 (SE +/- 0.00, N = 3; MAX: 3.85)
NCNN 20210720, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  GCC 11.2.0: 2.34 (SE +/- 0.01, N = 3; MIN: 2.32 / MAX: 2.49)
  Clang 13.0.1: 4.82 (SE +/- 0.01, N = 3; MIN: 4.8 / MAX: 4.85)
NCNN 20210720, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  GCC 11.2.0: 2.61 (SE +/- 0.05, N = 3; MIN: 2.48 / MAX: 12.2)
  Clang 13.0.1: 5.84 (SE +/- 0.01, N = 3; MIN: 5.81 / MAX: 5.87)
NCNN 20210720, Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  GCC 11.2.0: 14.40 (SE +/- 0.17, N = 3; MIN: 9.21 / MAX: 25.2)
  Clang 13.0.1: 22.21 (SE +/- 0.01, N = 3; MIN: 22.15 / MAX: 22.25)
(CXX) g++ options for all NCNN results: -O3 -flto -rdynamic
Additional per-run flags reported for all NCNN results: -lgomp -lpthread
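The fourteen NCNN results can be condensed into a single figure with a geometric mean of the per-model Clang/GCC time ratios. The sketch below is an illustration only: the page itself does not report this aggregate, and `geometric_mean` is a helper defined here. The timings are the NCNN values listed above.

```python
import math

# (GCC 11.2.0 ms, Clang 13.0.1 ms) per NCNN model; fewer is better.
ncnn_ms = {
    "regnety_400m": (5.88, 8.08),
    "squeezenet_ssd": (14.26, 20.15),
    "yolov4-tiny": (17.20, 32.77),
    "resnet50": (17.16, 46.79),
    "alexnet": (11.81, 31.66),
    "resnet18": (7.31, 18.18),
    "vgg16": (33.78, 80.44),
    "googlenet": (13.32, 25.11),
    "efficientnet-b0": (4.18, 9.51),
    "mnasnet": (2.52, 5.86),
    "shufflenet-v2": (2.17, 3.76),
    "mobilenet-v3": (2.34, 4.82),
    "mobilenet-v2": (2.61, 5.84),
    "mobilenet": (14.40, 22.21),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Ratio > 1 means the Clang build took longer than the GCC build.
ratios = {model: clang / gcc for model, (gcc, clang) in ncnn_ms.items()}
worst = max(ratios, key=ratios.get)

print(f"Geometric mean Clang/GCC time ratio: {geometric_mean(ratios.values()):.2f}x")
print(f"Largest ratio: {worst} ({ratios[worst]:.2f}x)")
```

A geometric mean is used rather than an arithmetic mean so that no single model with a large ratio dominates the summary.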
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image.
C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
  GCC 11.2.0: 64.44 (SE +/- 0.04, N = 3)
  Clang 13.0.1: 87.82 (SE +/- 0.05, N = 3)
  (CC) gcc options: -lm -lpthread -O3 -flto
Zstd Compression
Zstd Compression 1.5.0, Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
  GCC 11.2.0: 3546.2 (SE +/- 0.15, N = 3)
  Clang 13.0.1: 3684.3 (SE +/- 1.62, N = 3)
Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 22.7 (SE +/- 0.07, N = 3)
  Clang 13.0.1: 23.2 (SE +/- 0.17, N = 3)
(CC) gcc options for both results: -O3 -flto -pthread -lz -llzma -llz4
OpenSSL
OpenSSL 3.0, Algorithm: RSA4096 (verify/s, More Is Better)
  GCC 11.2.0: 99370.5 (SE +/- 18.59, N = 3)
  Clang 13.0.1: 99445.4 (SE +/- 16.80, N = 3)
OpenSSL 3.0, Algorithm: RSA4096 (sign/s, More Is Better)
  GCC 11.2.0: 1408.5 (SE +/- 0.78, N = 3)
  Clang 13.0.1: 1391.4 (SE +/- 0.15, N = 3)
(CC) gcc options for both results: -pthread -O3 -flto -lssl -lcrypto -ldl
Additional per-run flags reported for both results: -Qunused-arguments
Zstd Compression
Zstd Compression 1.5.0, Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better)
  GCC 11.2.0: 4416.3 (SE +/- 1.55, N = 3)
  Clang 13.0.1: 4553.4 (SE +/- 3.13, N = 3)
Zstd Compression 1.5.0, Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 693.0 (SE +/- 2.35, N = 3)
  Clang 13.0.1: 703.4 (SE +/- 2.38, N = 3)
Zstd Compression 1.5.0, Compression Level: 8 - Decompression Speed (MB/s, More Is Better)
  GCC 11.2.0: 4016.4 (SE +/- 1.95, N = 3)
  Clang 13.0.1: 4141.0 (SE +/- 3.02, N = 3)
Zstd Compression 1.5.0, Compression Level: 8 - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 721.5 (SE +/- 3.70, N = 3)
  Clang 13.0.1: 699.6 (SE +/- 4.97, N = 3)
Zstd Compression 1.5.0, Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  GCC 11.2.0: 3850.2 (SE +/- 0.87, N = 3)
  Clang 13.0.1: 3977.7 (SE +/- 0.75, N = 3)
Zstd Compression 1.5.0, Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 3341.2 (SE +/- 6.19, N = 3)
  Clang 13.0.1: 3301.1 (SE +/- 39.46, N = 3)
(CC) gcc options for all six results: -O3 -flto -pthread -lz -llzma -llz4
eSpeak-NG Speech Engine
This test times how long it takes the eSpeak speech synthesizer to read Project Gutenberg's The Outline of Science and output to a WAV file. This test profile is now tracking the eSpeak-NG version of eSpeak.
eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better)
  GCC 11.2.0: 22.29 (SE +/- 0.03, N = 4)
  Clang 13.0.1: 23.43 (SE +/- 0.03, N = 4)
  (CC) gcc options: -O3 -flto -std=c99 -lpthread -lm
Stress-NG 0.13.02, Test: Memory Copying (Bogo Ops/s, More Is Better)
  GCC 11.2.0: 2763.25 (SE +/- 6.71, N = 3)
  Clang 13.0.1: 3741.17 (SE +/- 15.21, N = 3)
Stress-NG 0.13.02, Test: IO_uring (Bogo Ops/s, More Is Better)
  GCC 11.2.0: 144281.67 (SE +/- 28.54, N = 3)
  Clang 13.0.1: 147040.98 (SE +/- 271.95, N = 3)
Stress-NG 0.13.02, Test: Matrix Math (Bogo Ops/s, More Is Better)
  GCC 11.2.0: 23588.96 (SE +/- 332.61, N = 3)
  Clang 13.0.1: 30254.21 (SE +/- 0.69, N = 3)
Stress-NG 0.13.02, Test: Socket Activity (Bogo Ops/s, More Is Better)
  GCC 11.2.0: 4331.71 (SE +/- 4.58, N = 3)
  Clang 13.0.1: 4313.48 (SE +/- 13.20, N = 3)
(CC) gcc options for all four results: -O3 -flto -O2 -std=gnu99 -lm -laio -lbsd -lcrypt -lrt -lz -ldl -pthread -lkmod -lc -latomic
LZ4 Compression 1.9.3, Compression Level: 1 - Compression Speed (MB/s, More Is Better)
  GCC 11.2.0: 21909.45 (SE +/- 5.50, N = 3)
  Clang 13.0.1: 21875.54 (SE +/- 3.05, N = 3)
  (CC) gcc options: -O3
Liquid-DSP
LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage.
Liquid-DSP 2021.01.31, Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  GCC 11.2.0: 151120000 (SE +/- 0.00, N = 3)
  Clang 13.0.1: 196510000 (SE +/- 0.00, N = 3)
Liquid-DSP 2021.01.31, Threads: 4 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  GCC 11.2.0: 115230000 (SE +/- 0.00, N = 3)
  Clang 13.0.1: 151820000 (SE +/- 0.00, N = 3)
Liquid-DSP 2021.01.31, Threads: 2 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  GCC 11.2.0: 57611000 (SE +/- 2081.67, N = 3)
  Clang 13.0.1: 75898667 (SE +/- 1763.83, N = 3)
Liquid-DSP 2021.01.31, Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  GCC 11.2.0: 28778667 (SE +/- 3527.67, N = 3)
  Clang 13.0.1: 37897333 (SE +/- 2905.93, N = 3)
(CC) gcc options for all four results: -O3 -flto -pthread -lm -lc -lliquid
OpenJPEG
OpenJPEG is an open-source JPEG 2000 codec written in the C programming language. The default input for this test profile is the NASA/JPL-Caltech/MSSS Curiosity panorama 717MB TIFF image file converting to JPEG2000 format.
OpenJPEG 2.4, Encode: NASA Curiosity Panorama M34 (ms, Fewer Is Better)
  GCC 11.2.0: 53890 (SE +/- 92.73, N = 3)
  Clang 13.0.1: 52024 (SE +/- 161.48, N = 3)
  (CXX) g++ options: -O3 -flto -rdynamic
Google Draco
Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression.
Google Draco 1.5.0, Model: Church Facade (ms, Fewer Is Better)
  GCC 11.2.0: 5649 (SE +/- 7.21, N = 3)
  Clang 13.0.1: 5722 (SE +/- 3.79, N = 3)
Google Draco 1.5.0, Model: Lion (ms, Fewer Is Better)
  GCC 11.2.0: 3747 (SE +/- 2.73, N = 3)
  Clang 13.0.1: 3772 (SE +/- 0.58, N = 3)
(CXX) g++ options for both results: -O3 -flto
GCC 11.2.0: Testing initiated at 8 April 2022 14:42 by user phoronix.
Clang 13.0.1: Testing initiated at 9 April 2022 12:49 by user phoronix.