POWER9 Blackbird

POWER9 testing with a PowerNV C1P9S01 REV 1.01 and ASPEED on Ubuntu 20.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012211-HA-POWER9BLA00.

POWER9 Blackbird
Runs: Run 1, Run 2, Run 3, 4
Processor: POWER9 @ 3.80GHz (4 Cores / 16 Threads)
Motherboard: PowerNV C1P9S01 REV 1.01
Memory: 128GB
Disk: 1024GB SAMSUNG MZVLB1T0HALR-000L7
Graphics: ASPEED
Network: 3 x Broadcom NetXtreme BCM5719 PCIe
OS: Ubuntu 20.10
Kernel: 5.8.0-29-generic (ppc64le)
Display Server: X Server 1.20.9
Display Driver: modesetting 1.20.9
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1024x768

Compiler Details: --build=powerpc64le-linux-gnu --disable-multilib --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-xyKMTo/gcc-10-10.2.0/debian/tmp-nvptx/usr --enable-plugin --enable-secureplt --enable-shared --enable-targets=powerpcle-linux --enable-threads=posix --host=powerpc64le-linux-gnu --program-prefix=powerpc64le-linux-gnu- --target=powerpc64le-linux-gnu --with-cpu=power8 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-long-double-128 --with-target-system-zlib=auto --without-cuda-driver -v
Disk Details: NONE / relatime,rw / Block Size: 4096
Processor Details: SMT (threads per core): 4
Python Details: Python 3.8.6
Security Details: itlb_multihit: Not affected + l1tf: Mitigation of RFI Flush L1D private per thread + mds: Not affected + meltdown: Mitigation of RFI Flush L1D private per thread + spec_store_bypass: Mitigation of Kernel entry/exit barrier (eieio) + spectre_v1: Mitigation of __user pointer sanitization ori31 speculation barrier enabled + spectre_v2: Mitigation of Indirect branch cache disabled Software link stack flush + srbds: Not affected + tsx_async_abort: Not affected

POWER9 Blackbird
Test | Run 1 | Run 2 | Run 3 | 4
x265: Bosphorus 4K | 1.33 | 1.38 | 1.37 | 1.39
x265: Bosphorus 1080p | 5.52 | 5.56 | 5.55 | 5.58
sqlite-speedtest: Timed Time - Size 1,000 | 165.443 | 165.969 | 165.886 | 166.109
build-apache: Time To Compile | 56.712 | 54.568 | 54.591 | 54.422
compilebench: Compile | 2152.14 | 2140.98 | 2145.28 | 2142.99
compilebench: Initial Create | 269.80 | 269.82 | 269.50 | 270.06
compilebench: Read Compiled Tree | 1537.55 | 1534.17 | 1534.99 | 1535.62
blosc: blosclz | 9147.1 | 9138.6 | 9113.9 | 9121.6
cryptopp: All Algorithms | 719.699252 | 719.221267 | 718.392075 | 719.508168
cryptopp: Keyed Algorithms | 297.732663 | 297.712480 | 297.699375 | 297.686322
cryptopp: Unkeyed Algorithms | 206.580172 | 206.567851 | 206.547745 | 206.619787
cryptopp: Integer + Elliptic Curve Public Key Algorithms | 1927.744803 | 1928.691661 | 1928.663452 | 1928.853150
hpcg | 3.22415 | 3.40852 | 3.37791 | 3.27655
dolfyn: Computational Fluid Dynamics | 39.050 | 38.590 | 40.754 | 38.973
mafft: Multiple Sequence Alignment - LSU RNA | 14.166 | 14.093 | 14.399 | 14.113
webp: Default | 7.873 | 7.872 | 7.933 | 7.871
webp: Quality 100 | 10.962 | 10.960 | 10.964 | 10.962
webp: Quality 100, Lossless | 29.977 | 29.998 | 29.985 | 29.981
webp: Quality 100, Highest Compression | 17.666 | 17.665 | 17.662 | 17.662
webp: Quality 100, Lossless, Highest Compression | 62.544 | 62.109 | 62.683 | 62.203
simdjson: Kostya | 1.06 | 1.06 | 1.06 | 1.06
simdjson: LargeRand | 0.48 | 0.48 | 0.48 | 0.48
simdjson: PartialTweets | 1.19 | 1.19 | 1.19 | 1.19
simdjson: DistinctUserID | 1.23 | 1.23 | 1.23 | 1.23
byte: Dhrystone 2 | 26948919.6 | 26802144.6 | 26536978.6 | 26522860.7
graphics-magick: Swirl | 145 | 148 | 146 | 149
graphics-magick: Rotate | 720 | 722 | 723 | 721
graphics-magick: Sharpen | 41 | 42 | 41 | 42
graphics-magick: Enhanced | 54 | 54 | 53 | 54
graphics-magick: Resizing | 309 | 312 | 308 | 313
graphics-magick: Noise-Gaussian | 57 | 58 | 57 | 58
graphics-magick: HWB Color Space | 454 | 459 | 459 | 458
onednn: IP Shapes 1D - f32 - CPU | 47.5949 | 47.0540 | 47.3154 | 46.9162
onednn: IP Shapes 3D - f32 - CPU | 56.9168 | 57.1732 | 57.1463 | 57.3250
onednn: IP Shapes 1D - u8s8f32 - CPU | 79.6263 | 79.4768 | 79.5320 | 79.4205
onednn: IP Shapes 3D - u8s8f32 - CPU | 87.3204 | 87.4333 | 87.0999 | 86.7792
onednn: Convolution Batch Shapes Auto - f32 - CPU | 117.876 | 124.743 | 124.071 | 126.072
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 410.516 | 414.560 | 413.268 | 416.662
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 69.6087 | 69.2510 | 68.8532 | 69.5690
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 433.179 | 434.068 | 433.873 | 434.077
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 211.045 | 210.847 | 210.436 | 210.732
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 113.780 | 113.726 | 113.834 | 113.441
onednn: Recurrent Neural Network Training - f32 - CPU | 68835.1 | 68454.8 | 68623.4 | 67915.1
onednn: Recurrent Neural Network Inference - f32 - CPU | 35678.4 | 34887.4 | 35697.9 | 34900.7
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 69366.7 | 68947.3 | 69145.5 | 68410.4
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 34619.5 | 34819.2 | 35222.4 | 34863.6
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 40.3135 | 40.2296 | 40.1076 | 39.7509
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 69245.6 | 68964.1 | 68767.9 | 68367.1
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 34997.2 | 34956.6 | 34743.7 | 34670.7
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 73.7377 | 73.4788 | 73.5579 | 73.5412
x264: H.264 Video Encoding | 13.95 | 13.68 | 13.64 | 13.72
coremark: CoreMark Size 666 - Iterations Per Second | 83541.791867 | 84212.955243 | 83708.509867 | 84749.705921
avifenc: 0 | 1461.483 | 1458.008 | 1459.942 | -
avifenc: 2 | 815.680 | 809.993 | 809.267 | -
avifenc: 8 | 30.398 | 30.151 | 30.297 | -
avifenc: 10 | 26.891 | 26.801 | 26.713 | -
build-ffmpeg: Time To Compile | 198.726 | 199.154 | 196.889 | -
build2: Time To Compile | 474.421 | 471.477 | 468.500 | -
c-ray: Total Time - 4K, 16 Rays Per Pixel | 216.932 | 213.183 | 211.381 | -
smallpt: Global Illumination Renderer; 128 Samples | 39.983 | 39.599 | 39.242 | -
build-eigen: Time To Compile | 128.439 | 129.121 | 128.972 | -
rnnoise | 38.116 | 38.127 | 38.468 | -
node-web-tooling | 4.06 | 4.06 | 4.07 | -
basis: ETC1S | 107.711 | 104.172 | 103.796 | -
basis: UASTC Level 0 | 14.225 | 14.166 | 14.101 | -
basis: UASTC Level 2 | 91.433 | 90.662 | 89.732 | -
basis: UASTC Level 3 | 185.428 | 183.150 | 181.475 | -
basis: UASTC Level 2 + RDO Post-Processing | 1413.839 | 1396.336 | 1389.465 | -
phpbench: PHP Benchmark Suite | 168649 | 168821 | 168642 | -
brl-cad: VGR Performance Metric | 25777 | 26544 | 26662 | -
clomp: Static OMP Speedup | 8.9 | 7.8 | 7.9 | -
encode-ape: WAV To APE | 21.710 | 22.650 | 22.678 | -
encode-wavpack: WAV To WavPack | 129.459 | 129.254 | 129.323 | -

x265

Video Input: Bosphorus 4K

x265 3.4 (Frames Per Second, More Is Better)
Run 1, Run 2, Run 3, 4: 1.33, 1.38, 1.37, 1.39
SE +/- 0.01, N = 3; SE +/- 0.03, N = 9; SE +/- 0.03, N = 9; SE +/- 0.03, N = 9
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
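
The "SE +/- x, N = y" figures in these results give a standard error over N trials. As a minimal sketch of that arithmetic, assuming SE means the standard error of the mean (sample standard deviation divided by the square root of N): the function name and the trial values below are hypothetical and are not taken from this result file, which only reports the averages.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-trial frame rates (illustrative only, not from this export).
trials = [1.32, 1.33, 1.34]
se = standard_error(trials)
print(f"mean = {statistics.mean(trials):.2f}, SE +/- {se:.2f}, N = {len(trials)}")
```

A small SE relative to the mean suggests the trials within a run agreed closely; a larger N usually means more trials were needed to reach that agreement.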

x265

Video Input: Bosphorus 1080p

x265 3.4 (Frames Per Second, More Is Better)
Run 1, Run 2, Run 3, 4: 5.52, 5.56, 5.55, 5.58
SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 165.44, 165.97, 165.89, 166.11
SE +/- 0.08, N = 3; SE +/- 0.32, N = 3; SE +/- 0.12, N = 3; SE +/- 0.58, N = 3
1. (CC) gcc options: -O2 -ldl -lz -lpthread

Timed Apache Compilation

Time To Compile

Timed Apache Compilation 2.4.41 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 56.71, 54.57, 54.59, 54.42
SE +/- 0.71, N = 3; SE +/- 0.65, N = 3; SE +/- 0.53, N = 3; SE +/- 0.32, N = 3
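
To compare runs of one test, the spread between the best and worst reported average can be expressed as a percentage. The sketch below does this for the Timed Apache Compilation averages above (56.71, 54.57, 54.59, 54.42 seconds for Run 1, Run 2, Run 3, and 4); the helper name `spread_percent` is hypothetical and not part of the Phoronix Test Suite.

```python
def spread_percent(values):
    """Return (max - min) / min as a percentage."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# Reported averages for Timed Apache Compilation 2.4.41 (seconds, fewer is better),
# in the order Run 1, Run 2, Run 3, 4.
apache_seconds = [56.71, 54.57, 54.59, 54.42]
print(f"spread: {spread_percent(apache_seconds):.1f}%")
```

The same helper applies to "More Is Better" tests, since it only measures the gap between the extremes.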

Compile Bench

Test: Compile

Compile Bench 0.6 (MB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 2152.14, 2140.98, 2145.28, 2142.99
SE +/- 2.29, N = 3; SE +/- 3.73, N = 3; SE +/- 2.29, N = 3; SE +/- 2.29, N = 3

Compile Bench

Test: Initial Create

Compile Bench 0.6 (MB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 269.80, 269.82, 269.50, 270.06
SE +/- 0.29, N = 3; SE +/- 0.48, N = 3; SE +/- 0.43, N = 3; SE +/- 0.24, N = 3

Compile Bench

Test: Read Compiled Tree

Compile Bench 0.6 (MB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 1537.55, 1534.17, 1534.99, 1535.62
SE +/- 3.38, N = 3; SE +/- 0.00, N = 3; SE +/- 0.69, N = 3; SE +/- 0.73, N = 3

C-Blosc

Compressor: blosclz

C-Blosc 2.0 Beta 5 (MB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 9147.1, 9138.6, 9113.9, 9121.6
SE +/- 11.02, N = 3; SE +/- 30.25, N = 3; SE +/- 10.70, N = 3; SE +/- 30.46, N = 3
1. (CXX) g++ options: -rdynamic

Crypto++

Test: All Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
Run 1, Run 2, Run 3, 4: 719.70, 719.22, 718.39, 719.51
SE +/- 0.18, N = 3; SE +/- 0.64, N = 3; SE +/- 0.63, N = 3; SE +/- 0.23, N = 3
1. (CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe

Crypto++

Test: Keyed Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
Run 1, Run 2, Run 3, 4: 297.73, 297.71, 297.70, 297.69
SE +/- 0.03, N = 3; SE +/- 0.01, N = 3; SE +/- 0.03, N = 3; SE +/- 0.02, N = 3
1. (CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
Run 1, Run 2, Run 3, 4: 206.58, 206.57, 206.55, 206.62
SE +/- 0.03, N = 3; SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.03, N = 3
1. (CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe

Crypto++

Test: Integer + Elliptic Curve Public Key Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
Run 1, Run 2, Run 3, 4: 1927.74, 1928.69, 1928.66, 1928.85
SE +/- 0.74, N = 3; SE +/- 0.84, N = 3; SE +/- 0.90, N = 3; SE +/- 0.43, N = 3
1. (CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
Run 1, Run 2, Run 3, 4: 3.22415, 3.40852, 3.37791, 3.27655
SE +/- 0.02935, N = 12; SE +/- 0.05781, N = 3; SE +/- 0.04304, N = 3; SE +/- 0.02883, N = 11
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

Dolfyn

Computational Fluid Dynamics

Dolfyn 0.527 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 39.05, 38.59, 40.75, 38.97
SE +/- 0.42, N = 15; SE +/- 0.30, N = 14; SE +/- 0.30, N = 3; SE +/- 0.47, N = 15

Timed MAFFT Alignment

Multiple Sequence Alignment - LSU RNA

Timed MAFFT Alignment 7.471 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 14.17, 14.09, 14.40, 14.11
SE +/- 0.04, N = 3; SE +/- 0.07, N = 3; SE +/- 0.07, N = 3; SE +/- 0.06, N = 3
1. (CC) gcc options: -std=c99 -O3 -lm -lpthread

WebP Image Encode

Encode Settings: Default

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 7.873, 7.872, 7.933, 7.871
SE +/- 0.001, N = 3; SE +/- 0.003, N = 3; SE +/- 0.050, N = 3; SE +/- 0.001, N = 3
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff

WebP Image Encode

Encode Settings: Quality 100

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 10.96, 10.96, 10.96, 10.96
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 29.98, 30.00, 29.99, 29.98
SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.03, N = 3
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 17.67, 17.67, 17.66, 17.66
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 62.54, 62.11, 62.68, 62.20
SE +/- 0.23, N = 3; SE +/- 0.05, N = 3; SE +/- 0.34, N = 3; SE +/- 0.06, N = 3
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff

simdjson

Throughput Test: Kostya

simdjson 0.7.1 (GB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 1.06, 1.06, 1.06, 1.06
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: LargeRandom

simdjson 0.7.1 (GB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 0.48, 0.48, 0.48, 0.48
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: PartialTweets

simdjson 0.7.1 (GB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 1.19, 1.19, 1.19, 1.19
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: DistinctUserID

simdjson 0.7.1 (GB/s, More Is Better)
Run 1, Run 2, Run 3, 4: 1.23, 1.23, 1.23, 1.23
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -pthread

BYTE Unix Benchmark

Computational Test: Dhrystone 2

BYTE Unix Benchmark 3.6 (LPS, More Is Better)
Run 1, Run 2, Run 3, 4: 26948919.6, 26802144.6, 26536978.6, 26522860.7
SE +/- 262883.61, N = 3; SE +/- 251286.12, N = 3; SE +/- 117176.51, N = 3; SE +/- 365002.63, N = 3

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 145, 148, 146, 149
SE +/- 2.99, N = 15; SE +/- 3.51, N = 12; SE +/- 3.39, N = 12; SE +/- 3.39, N = 12
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 720, 722, 723, 721
SE +/- 1.45, N = 3; SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 41, 42, 41, 42
SE +/- 0.43, N = 15; SE +/- 0.35, N = 15; SE +/- 0.42, N = 15; SE +/- 0.43, N = 15
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 54, 54, 53, 54
SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 309, 312, 308, 313
SE +/- 1.45, N = 3; SE +/- 0.88, N = 3; SE +/- 0.88, N = 3; SE +/- 1.33, N = 3
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 57, 58, 57, 58
SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.58, N = 3
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Run 1, Run 2, Run 3, 4: 454, 459, 459, 458
SE +/- 1.86, N = 3; SE +/- 2.33, N = 3; SE +/- 2.03, N = 3; SE +/- 2.91, N = 3
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lX11 -llzma -lxml2 -lz -lm -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 47.59, 47.05, 47.32, 46.92
SE +/- 0.11, N = 3; SE +/- 0.13, N = 3; SE +/- 0.09, N = 3; SE +/- 0.06, N = 3
MIN: 45.76; MIN: 45.64; MIN: 45.71; MIN: 45.69
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 56.92, 57.17, 57.15, 57.33
SE +/- 0.10, N = 3; SE +/- 0.16, N = 3; SE +/- 0.11, N = 3; SE +/- 0.09, N = 3
MIN: 55.71; MIN: 55.92; MIN: 56.07; MIN: 55.68
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 79.63, 79.48, 79.53, 79.42
SE +/- 0.30, N = 3; SE +/- 0.25, N = 3; SE +/- 0.18, N = 3; SE +/- 0.44, N = 3
MIN: 77.93; MIN: 77.83; MIN: 77.78; MIN: 76.96
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 87.32, 87.43, 87.10, 86.78
SE +/- 0.20, N = 3; SE +/- 0.12, N = 3; SE +/- 0.13, N = 3; SE +/- 0.38, N = 3
MIN: 83.84; MIN: 83.75; MIN: 84.47; MIN: 83.44
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 117.88, 124.74, 124.07, 126.07
SE +/- 2.51, N = 15; SE +/- 1.12, N = 15; SE +/- 1.19, N = 15; SE +/- 1.46, N = 3
MIN: 93.25; MIN: 112.86; MIN: 114.81; MIN: 120.23
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 410.52, 414.56, 413.27, 416.66
SE +/- 5.17, N = 3; SE +/- 5.24, N = 3; SE +/- 4.29, N = 3; SE +/- 4.17, N = 3
MIN: 393.35; MIN: 397.03; MIN: 397.48; MIN: 401.71
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 69.61, 69.25, 68.85, 69.57
SE +/- 0.20, N = 3; SE +/- 0.29, N = 3; SE +/- 0.24, N = 3; SE +/- 0.46, N = 3
MIN: 63.8; MIN: 63.21; MIN: 63.03; MIN: 63.23
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 433.18, 434.07, 433.87, 434.08
SE +/- 0.87, N = 3; SE +/- 0.91, N = 3; SE +/- 2.12, N = 3; SE +/- 1.51, N = 3
MIN: 418.75; MIN: 424.16; MIN: 418.37; MIN: 420.33
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 211.05, 210.85, 210.44, 210.73
SE +/- 2.20, N = 3; SE +/- 2.15, N = 3; SE +/- 1.95, N = 3; SE +/- 1.74, N = 3
MIN: 201.66; MIN: 200.86; MIN: 201.97; MIN: 201.66
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 113.78, 113.73, 113.83, 113.44
SE +/- 0.21, N = 3; SE +/- 0.19, N = 3; SE +/- 0.22, N = 3; SE +/- 0.39, N = 3
MIN: 106.16; MIN: 105.97; MIN: 105.37; MIN: 105.64
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 68835.1, 68454.8, 68623.4, 67915.1
SE +/- 582.01, N = 3; SE +/- 463.43, N = 3; SE +/- 530.76, N = 3; SE +/- 478.43, N = 3
MIN: 67456.4; MIN: 67192.4; MIN: 67272.9; MIN: 66594.1
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 35678.4, 34887.4, 35697.9, 34900.7
SE +/- 242.26, N = 3; SE +/- 50.76, N = 3; SE +/- 270.59, N = 3; SE +/- 113.42, N = 3
MIN: 35267.4; MIN: 34715.7; MIN: 35018.9; MIN: 34551
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 69366.7, 68947.3, 69145.5, 68410.4
SE +/- 99.10, N = 3; SE +/- 68.18, N = 3; SE +/- 126.04, N = 3; SE +/- 74.06, N = 3
MIN: 68870.8; MIN: 68554.2; MIN: 68634.5; MIN: 68044.1
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 34619.5, 34819.2, 35222.4, 34863.6
SE +/- 362.11, N = 3; SE +/- 125.07, N = 3; SE +/- 252.13, N = 3; SE +/- 315.84, N = 3
MIN: 33664.6; MIN: 34373.5; MIN: 34568.8; MIN: 34209.5
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 40.31, 40.23, 40.11, 39.75
SE +/- 0.23, N = 3; SE +/- 0.31, N = 3; SE +/- 0.08, N = 3; SE +/- 0.15, N = 3
MIN: 38.92; MIN: 38.81; MIN: 39.08; MIN: 38.42
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 69245.6, 68964.1, 68767.9, 68367.1
SE +/- 182.68, N = 3; SE +/- 127.44, N = 3; SE +/- 196.65, N = 3; SE +/- 202.82, N = 3
MIN: 68610.4; MIN: 68500.2; MIN: 68180.3; MIN: 67752.8
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 34997.2, 34956.6, 34743.7, 34670.7
SE +/- 70.24, N = 3; SE +/- 198.82, N = 3; SE +/- 135.08, N = 3; SE +/- 115.56, N = 3
MIN: 34696.7; MIN: 34480.6; MIN: 34309.4; MIN: 34255.4
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
Run 1, Run 2, Run 3, 4: 73.74, 73.48, 73.56, 73.54
SE +/- 0.41, N = 3; SE +/- 0.27, N = 3; SE +/- 0.54, N = 3; SE +/- 0.42, N = 3
MIN: 71.34; MIN: 71.33; MIN: 71.18; MIN: 71.36
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread

x264

H.264 Video Encoding

x264 2019-12-17 (Frames Per Second, More Is Better)
Run 1, Run 2, Run 3, 4: 13.95, 13.68, 13.64, 13.72
SE +/- 0.15, N = 3; SE +/- 0.04, N = 3; SE +/- 0.09, N = 3; SE +/- 0.04, N = 3
1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -ffast-math -maltivec -mabi=altivec -mvsx -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0 (Iterations/Sec, More Is Better)
Run 1, Run 2, Run 3, 4: 83541.79, 84212.96, 83708.51, 84749.71
SE +/- 210.07, N = 3; SE +/- 94.63, N = 3; SE +/- 98.85, N = 3; SE +/- 92.68, N = 3
1. (CC) gcc options: -O2 -lrt" -lrt

libavif avifenc

Encoder Speed: 0

libavif avifenc 0.7.3 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 1461.48, 1458.01, 1459.94
SE +/- 7.82, N = 3; SE +/- 5.03, N = 3; SE +/- 9.25, N = 3
1. (CXX) g++ options: -O3 -fPIC

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.7.3 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 815.68, 809.99, 809.27
SE +/- 4.68, N = 3; SE +/- 3.65, N = 3; SE +/- 4.04, N = 3
1. (CXX) g++ options: -O3 -fPIC

libavif avifenc

Encoder Speed: 8

libavif avifenc 0.7.3 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 30.40, 30.15, 30.30
SE +/- 0.11, N = 3; SE +/- 0.14, N = 3; SE +/- 0.17, N = 3
1. (CXX) g++ options: -O3 -fPIC

libavif avifenc

Encoder Speed: 10

libavif avifenc 0.7.3 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 26.89, 26.80, 26.71
SE +/- 0.07, N = 3; SE +/- 0.07, N = 3; SE +/- 0.06, N = 3
1. (CXX) g++ options: -O3 -fPIC

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 4.2.2 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 198.73, 199.15, 196.89
SE +/- 3.04, N = 3; SE +/- 2.08, N = 8; SE +/- 2.35, N = 6

Build2

Time To Compile

Build2 0.13 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 474.42, 471.48, 468.50
SE +/- 1.51, N = 3; SE +/- 0.91, N = 3; SE +/- 0.85, N = 3

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 216.93, 213.18, 211.38
SE +/- 0.50, N = 3; SE +/- 0.23, N = 3; SE +/- 0.24, N = 3
1. (CC) gcc options: -lm -lpthread -O3

Smallpt

Global Illumination Renderer; 128 Samples

Smallpt 1.0 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 39.98, 39.60, 39.24
SE +/- 0.24, N = 3; SE +/- 0.10, N = 3; SE +/- 0.16, N = 3
1. (CXX) g++ options: -fopenmp -O3

Timed Eigen Compilation

Time To Compile

Timed Eigen Compilation 3.3.9 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 128.44, 129.12, 128.97
SE +/- 0.11, N = 3; SE +/- 0.77, N = 3; SE +/- 0.66, N = 3

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 38.12, 38.13, 38.47
SE +/- 0.03, N = 3; SE +/- 0.01, N = 3; SE +/- 0.32, N = 3
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden

Node.js V8 Web Tooling Benchmark

Node.js V8 Web Tooling Benchmark (runs/s, More Is Better)
Run 1, Run 2, Run 3: 4.06, 4.06, 4.07
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
1. Nodejs v12.18.2

Basis Universal

Settings: ETC1S

Basis Universal 1.12 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 107.71, 104.17, 103.80
SE +/- 1.42, N = 4; SE +/- 1.71, N = 3; SE +/- 1.70, N = 3
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.12 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 14.23, 14.17, 14.10
SE +/- 0.02, N = 3; SE +/- 0.06, N = 3; SE +/- 0.03, N = 3
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.12 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 91.43, 90.66, 89.73
SE +/- 0.84, N = 10; SE +/- 0.77, N = 12; SE +/- 0.81, N = 11
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.12 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 185.43, 183.15, 181.48
SE +/- 0.90, N = 3; SE +/- 0.96, N = 3; SE +/- 0.93, N = 3
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2 + RDO Post-Processing

Basis Universal 1.12 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 1413.84, 1396.34, 1389.47
SE +/- 3.25, N = 3; SE +/- 1.94, N = 3; SE +/- 0.87, N = 3
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

PHPBench

PHP Benchmark Suite

PHPBench 0.8.1 (Score, More Is Better)
Run 1, Run 2, Run 3: 168649, 168821, 168642
SE +/- 49.24, N = 3; SE +/- 36.10, N = 3; SE +/- 14.53, N = 3

BRL-CAD

VGR Performance Metric

BRL-CAD 7.30.8 (VGR Performance Metric, More Is Better)
Run 1, Run 2, Run 3: 25777, 26544, 26662
1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lpthread -ldl -luuid -lm

CLOMP

Static OMP Speedup

CLOMP 1.2 (Speedup, More Is Better)
Run 1, Run 2, Run 3: 8.9, 7.8, 7.9
SE +/- 0.03, N = 3; SE +/- 0.03, N = 3; SE +/- 0.09, N = 3
1. (CC) gcc options: -fopenmp -O3 -lm

Monkey Audio Encoding

WAV To APE

Monkey Audio Encoding 3.99.6 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 21.71, 22.65, 22.68
SE +/- 0.07, N = 5; SE +/- 0.24, N = 5; SE +/- 0.24, N = 5
1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt

WavPack Audio Encoding

WAV To WavPack

WavPack Audio Encoding 5.3 (Seconds, Fewer Is Better)
Run 1, Run 2, Run 3: 129.46, 129.25, 129.32
SE +/- 0.10, N = 5; SE +/- 0.12, N = 5; SE +/- 0.19, N = 5
1. (CXX) g++ options: -rdynamic


Phoronix Test Suite v10.8.4