Ampere ARMv8 Neoverse-N1 testing with a WIWYNN Mt.Jade (2.03.20210719 SCP: BIOS) and ASPEED on Ubuntu 21.10 via the Phoronix Test Suite.
A
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Java Notes: OpenJDK Runtime Environment (build 11.0.14+9-Ubuntu-0ubuntu2.22.10)
Python Notes: Python 3.9.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
B Processor: Ampere ARMv8 Neoverse-N1 @ 3.00GHz (256 Cores), Motherboard: WIWYNN Mt.Jade (2.03.20210719 SCP: BIOS), Chipset: Ampere Computing LLC Altra PCI Root Complex A, Memory: 512GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB SAMSUNG MZ1LB960HAJQ-00007, Graphics: ASPEED, Network: Mellanox MT28908 + Intel I210
OS: Ubuntu 21.10, Kernel: 5.13.0-27-generic (aarch64), Display Server: X Server, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1024x768
A vs. B Comparison (OpenBenchmarking.org chart of per-test percentage differences). Largest deltas: oneDNN Convolution Batch Shapes Auto f32 81.2%, oneDNN Matrix Multiply Batch Shapes Transformer u8s8f32 54.5%, oneDNN Recurrent Neural Network Inference u8s8f32 29.1%, oneDNN Recurrent Neural Network Training u8s8f32 27.0%, ONNX Runtime GPT-2 Standard 26.2%, oneDNN Convolution Batch Shapes Auto u8s8f32 24.6%, oneDNN Recurrent Neural Network Training bf16bf16bf16 23.3%, Stress-NG SENDFILE 22.6%; the remaining tests differ by roughly 16% or less.
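The percentage deltas charted in the A vs. B comparison are simple ratios of the slower result to the faster one. A minimal sketch of that calculation (the helper name `pct_delta` is illustrative, not part of the Phoronix Test Suite):

```python
# Percentage delta as charted: slower result divided by faster, minus 1.
def pct_delta(a: float, b: float) -> float:
    """How much worse the slower of two results is, in percent."""
    lo, hi = sorted((a, b))
    return (hi / lo - 1.0) * 100.0

# oneDNN Convolution Batch Shapes Auto, f32 (ms, fewer is better):
# A = 20.8501 ms vs. B = 11.5082 ms, the chart's top entry.
print(f"{pct_delta(20.8501, 11.5082):.1f}%")  # -> 81.2%
```

The same formula reproduces the other chart entries, e.g. the Matrix Multiply Batch Shapes Transformer u8s8f32 result (109.234 vs. 168.787 ms) gives 54.5%.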
new tests may aarch64 - Result Table

Test | A | B
perf-bench: Epoll Wait | 1064 | 1119
perf-bench: Futex Hash | 326958 | 326997
perf-bench: Memcpy 1MB | 15.734928 | 14.680252
perf-bench: Memset 1MB | 44.74628 | 44.742487
perf-bench: Sched Pipe | 200009 | 212491
perf-bench: Futex Lock-Pi | 45 | 46
perf-bench: Syscall Basic | 7827961 | 7850909
avifenc: 0 | 252.064 | 252.637
avifenc: 2 | 163.437 | 164.675
avifenc: 6 | 4.641 | 4.61
avifenc: 6, Lossless | 7.879 | 7.847
avifenc: 10, Lossless | 6.131 | 5.958
onednn: IP Shapes 1D - f32 - CPU | 44.2317 | 43.7346
onednn: IP Shapes 3D - f32 - CPU | 40.815 | 40.9995
onednn: IP Shapes 1D - u8s8f32 - CPU | 106.976 | 106.213
onednn: IP Shapes 3D - u8s8f32 - CPU | 166.896 | 159.304
onednn: Convolution Batch Shapes Auto - f32 - CPU | 20.8501 | 11.5082
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 56.4449 | 63.1626
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 49.7698 | 50.9578
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 130.806 | 163.014
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 89.0685 | 93.3418
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 36.1944 | 35.853
onednn: Recurrent Neural Network Training - f32 - CPU | 16357.1 | 16750.3
onednn: Recurrent Neural Network Inference - f32 - CPU | 12883.7 | 11904
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 18083 | 14239
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 13539.1 | 10490.5
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 34.4241 | 37.9352
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 17223 | 13968.6
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 11352 | 12769.6
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 109.234 | 168.787
webp2: Default | 3.444 | 3.423
webp2: Quality 75, Compression Effort 7 | 161.268 | 161.48
webp2: Quality 95, Compression Effort 7 | 318.767 | 318.017
webp2: Quality 100, Compression Effort 5 | 5.174 | 5.178
webp2: Quality 100, Lossless Compression | 625.647 | 624.943
stress-ng: MMAP | 2051.87 | 1967.43
stress-ng: NUMA | 32.37 | 32.47
stress-ng: Futex | 328915.8 | 309072.4
stress-ng: MEMFD | 1422.26 | 1427.31
stress-ng: Atomic | 1 | 1
stress-ng: Crypto | 333734.79 | 331354.53
stress-ng: Malloc | 114773687.93 | 110879866.19
stress-ng: Forking | 14393.7 | 13877.64
stress-ng: IO_uring | 1937964.93 | 1917001.58
stress-ng: SENDFILE | 1979249.09 | 1614754.32
stress-ng: CPU Cache | 419.89 | 393.71
stress-ng: CPU Stress | 65472.66 | 65456.3
stress-ng: Semaphores | 18356640.23 | 18374889.5
stress-ng: Matrix Math | 1198793 | 1198297.22
stress-ng: Vector Math | 723548.66 | 723262.7
stress-ng: Memory Copying | 9674.49 | 9837.19
stress-ng: Socket Activity | 18596.34 | 18652.55
stress-ng: Context Switching | 31806812.55 | 31522925.81
stress-ng: Glibc C String Functions | 19212257.31 | 19155740.91
stress-ng: Glibc Qsort Data Sorting | 2477.96 | 2482.09
stress-ng: System V Message Passing | 2426671.24 | 2444337.47
java-jmh: Throughput | 493572978516.11 | 480270249798.56
nginx: 1 | 41813.43 | 41347.93
nginx: 20 | 83090.65 | 82864.74
nginx: 100 | 68830.2 | 68559.66
nginx: 200 | 61794.37 | 62156.77
nginx: 500 | 63406.6 | 63986.76
nginx: 1000 | 65624.22 | 65932.64
onnx: GPT-2 - CPU - Parallel | 1596 | 1625
onnx: GPT-2 - CPU - Standard | 4235 | 5345
onnx: yolov4 - CPU - Parallel | 214 | 216
onnx: yolov4 - CPU - Standard | 203 | 194
onnx: bertsquad-12 - CPU - Parallel | 303 | 300
onnx: bertsquad-12 - CPU - Standard | 516 | 505
onnx: fcn-resnet101-11 - CPU - Parallel | 40 | 40
onnx: fcn-resnet101-11 - CPU - Standard | 38 | 37
onnx: ArcFace ResNet-100 - CPU - Parallel | 239 | 241
onnx: ArcFace ResNet-100 - CPU - Standard | 229 | 266
onnx: super-resolution-10 - CPU - Parallel | 3558 | 3555
onnx: super-resolution-10 - CPU - Standard | 4242 | 4276
apache: 1 | 6546.58 | 7614.76
apache: 20 | 16782.58 | 18160.31
apache: 100 | 37378.82 | 36149.79
apache: 200 | 45403.03 | 50848.21
apache: 500 | 51287.26 | 52228.94
apache: 1000 | 44329.5 | 48169.54
influxdb: 1024 - 10000 - 2,5000,1 - 10000 | (no result) | (no result)
perf-bench (more is better):
  Benchmark: Futex Hash (ops/sec): A: 326958, B: 326997
  Benchmark: Memcpy 1MB (GB/sec): A: 15.73, B: 14.68
  Benchmark: Memset 1MB (GB/sec): A: 44.75, B: 44.74
  Benchmark: Sched Pipe (ops/sec): A: 200009, B: 212491
  Benchmark: Futex Lock-Pi (ops/sec): A: 45, B: 46
  Benchmark: Syscall Basic (ops/sec): A: 7827961, B: 7850909
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99 -lnuma
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.6 (ms, fewer is better):
  Harness: IP Shapes 1D - f32 - CPU: A: 44.23 (min 35.62), B: 43.73 (min 36.57)
  Harness: IP Shapes 3D - f32 - CPU: A: 40.82 (min 38.09), B: 41.00 (min 36.51)
  Harness: IP Shapes 1D - u8s8f32 - CPU: A: 106.98 (min 87.35), B: 106.21 (min 85.76)
  Harness: IP Shapes 3D - u8s8f32 - CPU: A: 166.90 (min 163.69), B: 159.30 (min 156.65)
1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=native -fPIC -std=c++11 -pie -ldl -lpthread
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
A: The test run did not produce a result.
B: The test run did not produce a result.
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
A: The test run did not produce a result.
B: The test run did not produce a result.
oneDNN 2.6 (ms, fewer is better):
  Harness: Convolution Batch Shapes Auto - f32 - CPU: A: 20.85 (min 4.72), B: 11.51 (min 4.62)
  Harness: Deconvolution Batch shapes_1d - f32 - CPU: A: 56.44 (min 42.46), B: 63.16 (min 47.63)
  Harness: Deconvolution Batch shapes_3d - f32 - CPU: A: 49.77 (min 32.58), B: 50.96 (min 35.78)
  Harness: Convolution Batch Shapes Auto - u8s8f32 - CPU: A: 130.81 (min 126.78), B: 163.01 (min 155.37)
  Harness: Deconvolution Batch shapes_1d - u8s8f32 - CPU: A: 89.07 (min 61.45), B: 93.34 (min 59.72)
  Harness: Deconvolution Batch shapes_3d - u8s8f32 - CPU: A: 36.19 (min 34.9), B: 35.85 (min 33.87)
  Harness: Recurrent Neural Network Training - f32 - CPU: A: 16357.1 (min 13765), B: 16750.3 (min 13295.3)
  Harness: Recurrent Neural Network Inference - f32 - CPU: A: 12883.7 (min 10111.2), B: 11904.0 (min 10460)
  Harness: Recurrent Neural Network Training - u8s8f32 - CPU: A: 18083 (min 14969.5), B: 14239 (min 13174.4)
1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=native -fPIC -std=c++11 -pie -ldl -lpthread
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
A: The test run did not produce a result.
B: The test run did not produce a result.
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
A: The test run did not produce a result.
B: The test run did not produce a result.
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
A: The test run did not produce a result.
B: The test run did not produce a result.
oneDNN 2.6 (ms, fewer is better):
  Harness: Recurrent Neural Network Inference - u8s8f32 - CPU: A: 13539.1 (min 11569.9), B: 10490.5 (min 9172.45)
  Harness: Matrix Multiply Batch Shapes Transformer - f32 - CPU: A: 34.42 (min 30.23), B: 37.94 (min 33.64)
  Harness: Recurrent Neural Network Training - bf16bf16bf16 - CPU: A: 17223.0 (min 14988), B: 13968.6 (min 12588.7)
  Harness: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: A: 11352.0 (min 10383.3), B: 12769.6 (min 10807.2)
  Harness: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: A: 109.23 (min 103.21), B: 168.79 (min 160.75)
1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=native -fPIC -std=c++11 -pie -ldl -lpthread
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
A: The test run did not produce a result.
B: The test run did not produce a result.
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development, ultimately intended as the successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading. Learn more via the OpenBenchmarking.org test page.
WebP2 Image Encode 20220422 (Seconds, fewer is better):
  Encode Settings: Default: A: 3.444, B: 3.423
1. (CXX) g++ options: -fno-rtti -O3 -ldl
Stress-NG 0.14 (Bogo Ops/s, more is better):
  Test: NUMA: A: 32.37, B: 32.47
  Test: Futex: A: 328915.8, B: 309072.4
  Test: MEMFD: A: 1422.26, B: 1427.31
  Test: Atomic: A: 1, B: 1
  Test: Crypto: A: 333734.79, B: 331354.53
  Test: Malloc: A: 114773687.93, B: 110879866.19
  Test: Forking: A: 14393.70, B: 13877.64
  Test: IO_uring: A: 1937964.93, B: 1917001.58
  Test: SENDFILE: A: 1979249.09, B: 1614754.32
  Test: CPU Cache: A: 419.89, B: 393.71
  Test: CPU Stress: A: 65472.66, B: 65456.30
  Test: Semaphores: A: 18356640.23, B: 18374889.50
  Test: Matrix Math: A: 1198793.00, B: 1198297.22
  Test: Vector Math: A: 723548.66, B: 723262.70
  Test: Memory Copying: A: 9674.49, B: 9837.19
  Test: Socket Activity: A: 18596.34, B: 18652.55
  Test: Context Switching: A: 31806812.55, B: 31522925.81
  Test: Glibc C String Functions: A: 19212257.31, B: 19155740.91
  Test: Glibc Qsort Data Sorting: A: 2477.96, B: 2482.09
  Test: System V Message Passing: A: 2426671.24, B: 2444337.47
1. (CC) gcc options: -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lsctp -lz -pthread
nginx This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
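As a rough illustration of that methodology (a fixed test duration with a configurable number of concurrent clients), the sketch below runs a tiny threaded load loop against a throwaway local `http.server` instance. All names here are illustrative, not from the test profile; the real benchmark drives the web server with the Go "bombardier" tool:

```python
# Minimal sketch of a fixed-duration, N-client HTTP load loop.
import http.server
import socketserver
import threading
import time
import urllib.request

class QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, *args):  # silence per-request logging
        pass

def measure_rps(clients: int = 4, seconds: float = 1.0) -> float:
    # Throwaway local server on an ephemeral port.
    httpd = socketserver.ThreadingTCPServer(("127.0.0.1", 0), QuietHandler)
    httpd.daemon_threads = True
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()

    counts = [0] * clients
    deadline = time.monotonic() + seconds

    def client(i: int) -> None:
        url = f"http://127.0.0.1:{port}/"
        while time.monotonic() < deadline:
            with urllib.request.urlopen(url) as resp:
                resp.read()
            counts[i] += 1

    workers = [threading.Thread(target=client, args=(i,)) for i in range(clients)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    httpd.shutdown()
    httpd.server_close()
    return sum(counts) / seconds  # requests per second across all clients
```

Raising `clients` mirrors the "Concurrent Requests" axis in the results below; throughput typically climbs with concurrency until the server saturates.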
nginx 1.21.1 (Requests Per Second, more is better):
  Concurrent Requests: 1: A: 41813.43, B: 41347.93
  Concurrent Requests: 20: A: 83090.65, B: 82864.74
  Concurrent Requests: 100: A: 68830.20, B: 68559.66
  Concurrent Requests: 200: A: 61794.37, B: 62156.77
  Concurrent Requests: 500: A: 63406.60, B: 63986.76
  Concurrent Requests: 1000: A: 65624.22, B: 65932.64
1. (CC) gcc options: -lcrypt -lz -O3 -march=native
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.11 (Inferences Per Minute, more is better):
  Model: GPT-2 - Device: CPU - Executor: Parallel: A: 1596, B: 1625
  Model: GPT-2 - Device: CPU - Executor: Standard: A: 4235, B: 5345
  Model: yolov4 - Device: CPU - Executor: Parallel: A: 214, B: 216
  Model: yolov4 - Device: CPU - Executor: Standard: A: 203, B: 194
  Model: bertsquad-12 - Device: CPU - Executor: Parallel: A: 303, B: 300
  Model: bertsquad-12 - Device: CPU - Executor: Standard: A: 516, B: 505
  Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel: A: 40, B: 40
  Model: fcn-resnet101-11 - Device: CPU - Executor: Standard: A: 38, B: 37
  Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel: A: 239, B: 241
  Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard: A: 229, B: 266
  Model: super-resolution-10 - Device: CPU - Executor: Parallel: A: 3558, B: 3555
  Model: super-resolution-10 - Device: CPU - Executor: Standard: A: 4242, B: 4276
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
Apache HTTP Server This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
Apache HTTP Server 2.4.48 (Requests Per Second, more is better):
  Concurrent Requests: 1: A: 6546.58, B: 7614.76
1. (CC) gcc options: -shared -fPIC -O2
InfluxDB This is a benchmark of the InfluxDB open-source time-series database, optimized for fast, high-availability storage for IoT and other use cases. The InfluxDB test profile makes use of InfluxDB Inch for facilitating the benchmarks. Learn more via the OpenBenchmarking.org test page.
Concurrent Streams: 4 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000
A: The test quit with a non-zero exit status. E: unable to connect to "http://localhost:8086": Get "http://localhost:8086/ping": dial tcp 127.0.0.1:8086: connect: connection refused
B: The test quit with a non-zero exit status. E: unable to connect to "http://localhost:8086": Get "http://localhost:8086/ping": dial tcp 127.0.0.1:8086: connect: connection refused
Concurrent Streams: 64 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000
A: The test quit with a non-zero exit status. E: unable to connect to "http://localhost:8086": Get "http://localhost:8086/ping": dial tcp 127.0.0.1:8086: connect: connection refused
B: The test quit with a non-zero exit status. E: unable to connect to "http://localhost:8086": Get "http://localhost:8086/ping": dial tcp 127.0.0.1:8086: connect: connection refused
Concurrent Streams: 1024 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000
A: The test quit with a non-zero exit status. E: unable to connect to "http://localhost:8086": Get "http://localhost:8086/ping": dial tcp 127.0.0.1:8086: connect: connection refused
B: The test quit with a non-zero exit status. E: unable to connect to "http://localhost:8086": Get "http://localhost:8086/ping": dial tcp 127.0.0.1:8086: connect: connection refused
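All of the InfluxDB runs failed the same way: the Inch client's ping of the server at localhost:8086 was refused, meaning no InfluxDB daemon was listening. A hedged sketch of an equivalent pre-flight reachability check (`influx_reachable` is a hypothetical helper, not part of the test profile; a running InfluxDB server answers GET /ping with HTTP 204 No Content):

```python
# Probe the InfluxDB HTTP /ping endpoint before starting a benchmark run.
import urllib.error
import urllib.request

def influx_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if an InfluxDB server answers at base_url/ping."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status in (200, 204)
    except (urllib.error.URLError, OSError):
        # Covers "connection refused", exactly as seen in the logs above.
        return False

if __name__ == "__main__":
    print(influx_reachable("http://localhost:8086"))
```

Had this check been run first, the three failed InfluxDB configurations could have been skipped or the daemon started before testing.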
A
Testing initiated at 6 May 2022 19:12 by user phoronix.
B
Testing initiated at 6 May 2022 22:06 by user phoronix.