AMD AOCC 2.2 vs. GCC vs. Clang - EPYC 7742 2P

AMD AOCC 2.2 compiler benchmarked against GCC 10 and LLVM Clang 10. Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2008278-FI-AMDAOCC2233&grs.

Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads)
Motherboard: AMD DAYTONA_X (RDY1006G BIOS)
Chipset: AMD Starship/Matisse
Memory: 504GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: ASPEED
Monitor: VE228
Network: 2 x Mellanox MT27710
OS: Ubuntu 20.10
Kernel: 5.4.0-42-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
Compiler: Clang 10.0.0 (AOCC 2.2), GCC 10.2.0 (GCC 10.2), Clang 10.0.1-1 (Clang 10.1)
File-System: ext4
Screen Resolution: 1920x1080

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details
- AOCC 2.2: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver2
- GCC 10.2: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-Fb4d6e/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Fb4d6e/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq performance
- CPU Microcode: 0x8301034

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Test | AOCC 2.2 | GCC 10.2 | Clang 10.1
onednn: Recurrent Neural Network Training - f32 - CPU | 162.417 | 870.720 | 219.266
build-mplayer: Time To Compile | 35.877 | 10.703 | 18.086
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.228572 | 0.737722 | 0.272380
onednn: Deconvolution Batch deconv_1d - f32 - CPU | 0.989625 | 2.82688 | 1.05769
onednn: IP Batch 1D - f32 - CPU | 0.812279 | 1.98380 | 0.925616
build-ffmpeg: Time To Compile | 39.871 | 16.444 | 21.817
build-apache: Time To Compile | 41.247 | 23.211 | 21.739
onednn: IP Batch All - f32 - CPU | 12.6765 | 18.8696 | 12.9123
onednn: Convolution Batch Shapes Auto - f32 - CPU | 0.506847 | 0.744088 | 0.537037
openssl: RSA 4096-bit Performance | 18574.8 | 24437.6 | 18618.0
graphics-magick: Enhanced | 1108 | 1439 | 1327
hmmer: Pfam Database Search | 5.608 | 6.101 | 4.888
scimark2: Sparse Matrix Multiply | 3450.97 | 2811.30 | 3462.03
john-the-ripper: Blowfish | 176970 | 149154 | 177644
tscp: AI Chess Performance | 1163950 | 1014328 | 1178898
cpp-perf-bench: Stepanov Vector | 90.495 | 99.191 | 86.121
astcenc: Thorough | 8.04 | 9.09 | 8.11
astcenc: Fast | 4.99 | 5.61 | 5.17
astcenc: Exhaustive | 19.95 | 22.00 | 19.71
astcenc: Medium | 5.67 | 6.25 | 5.73
cpp-perf-bench: Stepanov Abstraction | 34.007 | 36.971 | 33.550
brl-cad: VGR Performance Metric | 2693129 | 2951330 | 2779511
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 412.51 | 382.50 | 388.72
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 336.19 | 311.85 | 321.84
cpp-perf-bench: Ctype | 40.209 | 43.269 | 40.336
svt-vp9: VMAF Optimized - Bosphorus 1080p | 392.64 | 376.61 | 366.54
bullet: 1000 Convex | 4.557857 | 4.873698 | 4.558771
graphics-magick: Rotate | 529 | 497 | 527
stockfish: Total Time | 254824742 | 242098165 | 257008114
rodinia: OpenMP Leukocyte | 50.140 | 52.074 | 52.708
bullet: Convex Trimesh | 1.183395 | 1.23393 | 1.176618
x264: H.264 Video Encoding | 204.60 | 204.52 | 195.51
mrbayes: Primate Phylogeny Analysis | 106.584 | 109.589 | 111.352
bullet: Prim Trimesh | 1.00944 | 1.039175 | 1.021595
leveldb: Hot Read | 286.165 | 289.700 | 293.718
cpp-perf-bench: Math Library | 334.752 | 339.961 | 331.619
cryptopp: Unkeyed Algorithms | 317.379307 | 310.143205 | 316.539334
bullet: 3000 Fall | 4.310990 | 4.313341 | 4.387945
scimark2: Monte Carlo | 620.89 | 611.72 | 620.64
basis: UASTC Level 3 | 13.735 | 13.756 | 13.569
apache: Static Web Page Serving | 27488.36 | 26062.36 | 26903.49
onednn: Recurrent Neural Network Inference - f32 - CPU | 61.8650 | 351.228 | 87.5976
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 2.57847 | 2.98436 | 2.62134
graphics-magick: Resizing | 351 | 107 | 162
compress-zstd: 19 | 128.3 | 125.7 | 120.7
rodinia: OpenMP Streamcluster | 10.017 | 10.005 | 10.713

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 162.42 (SE +/- 2.17, N = 4; MIN: 145.25; -fopenmp=libomp)
GCC 10.2: 870.72 (SE +/- 4.49, N = 3; MIN: 830.55; -fopenmp)
Clang 10.1: 219.27 (SE +/- 1.11, N = 3; MIN: 201.2; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
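Because the charts mix "Fewer Is Better" and "More Is Better" units, a ratio between two compilers has to be oriented by the unit before it can be read as a speedup. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite. The helper name `speedup` is hypothetical, and the inputs are the oneDNN Recurrent Neural Network Training averages above plus the OpenSSL RSA 4096-bit averages reported further down.

```python
def speedup(baseline: float, candidate: float, fewer_is_better: bool) -> float:
    """Return how many times better `candidate` is than `baseline`.

    Hypothetical helper for illustration: values above 1.0 mean the
    candidate wins, values below 1.0 mean the baseline wins.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("benchmark results must be positive")
    if fewer_is_better:
        # Times (ms, seconds): a smaller candidate value is a win.
        return baseline / candidate
    # Rates (signs/s, FPS, MB/s): a larger candidate value is a win.
    return candidate / baseline


# oneDNN RNN Training, ms, fewer is better: GCC 10.2 baseline vs. AOCC 2.2.
rnn_training = speedup(870.72, 162.42, fewer_is_better=True)

# OpenSSL RSA 4096-bit, signs per second, more is better: same pairing.
openssl_rsa = speedup(24437.6, 18574.8, fewer_is_better=False)

print(f"oneDNN RNN Training: AOCC 2.2 is {rnn_training:.2f}x GCC 10.2")
print(f"OpenSSL RSA 4096:    AOCC 2.2 is {openssl_rsa:.2f}x GCC 10.2")
```

The same function can be applied to any row of the results, as long as the direction flag matches the unit line of that chart.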

Timed MPlayer Compilation

Time To Compile

Timed MPlayer Compilation 1.4 (Seconds, Fewer Is Better)
AOCC 2.2: 35.88 (SE +/- 0.20, N = 3)
GCC 10.2: 10.70 (SE +/- 0.06, N = 3)
Clang 10.1: 18.09 (SE +/- 0.03, N = 3)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 0.228572 (SE +/- 0.003245, N = 4; MIN: 0.18; -fopenmp=libomp)
GCC 10.2: 0.737722 (SE +/- 0.002686, N = 3; MIN: 0.66; -fopenmp)
Clang 10.1: 0.272380 (SE +/- 0.002288, N = 3; MIN: 0.22; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 0.989625 (SE +/- 0.010667, N = 3; MIN: 0.73; -fopenmp=libomp)
GCC 10.2: 2.826880 (SE +/- 0.012408, N = 3; MIN: 2.5; -fopenmp)
Clang 10.1: 1.057690 (SE +/- 0.004959, N = 3; MIN: 0.91; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch 1D - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 0.812279 (SE +/- 0.008809, N = 3; MIN: 0.71; -fopenmp=libomp)
GCC 10.2: 1.983800 (SE +/- 0.003268, N = 3; MIN: 1.79; -fopenmp)
Clang 10.1: 0.925616 (SE +/- 0.006045, N = 3; MIN: 0.83; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 4.2.2 (Seconds, Fewer Is Better)
AOCC 2.2: 39.87 (SE +/- 0.19, N = 3)
GCC 10.2: 16.44 (SE +/- 0.03, N = 3)
Clang 10.1: 21.82 (SE +/- 0.05, N = 3)

Timed Apache Compilation

Time To Compile

Timed Apache Compilation 2.4.41 (Seconds, Fewer Is Better)
AOCC 2.2: 41.25 (SE +/- 0.02, N = 3)
GCC 10.2: 23.21 (SE +/- 0.03, N = 3)
Clang 10.1: 21.74 (SE +/- 0.02, N = 3)

oneDNN

Harness: IP Batch All - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 12.68 (SE +/- 0.06, N = 3; MIN: 11.46; -fopenmp=libomp)
GCC 10.2: 18.87 (SE +/- 0.21, N = 7; MIN: 15.39; -fopenmp)
Clang 10.1: 12.91 (SE +/- 0.09, N = 3; MIN: 11.83; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 0.506847 (SE +/- 0.003081, N = 3; MIN: 0.44; -fopenmp=libomp)
GCC 10.2: 0.744088 (SE +/- 0.002043, N = 3; MIN: 0.66; -fopenmp)
Clang 10.1: 0.537037 (SE +/- 0.002740, N = 3; MIN: 0.47; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1 (Signs Per Second, More Is Better)
AOCC 2.2: 18574.8 (SE +/- 169.89, N = 3)
GCC 10.2: 24437.6 (SE +/- 258.78, N = 3)
Clang 10.1: 18618.0 (SE +/- 111.56, N = 3)
Configuration-specific options: -Qunused-arguments (listed for two of the three configurations)
1. (CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 2.2: 1108 (SE +/- 0.58, N = 3)
GCC 10.2: 1439 (SE +/- 6.43, N = 3)
Clang 10.1: 1327 (SE +/- 5.69, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2 (Seconds, Fewer Is Better)
AOCC 2.2: 5.608 (SE +/- 0.078, N = 3)
GCC 10.2: 6.101 (SE +/- 0.081, N = 3)
Clang 10.1: 4.888 (SE +/- 0.073, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 (Mflops, More Is Better)
AOCC 2.2: 3450.97 (SE +/- 7.98, N = 3)
GCC 10.2: 2811.30 (SE +/- 28.45, N = 3)
Clang 10.1: 3462.03 (SE +/- 15.20, N = 3)
1. (CC) gcc options: -O3 -march=native -lm

John The Ripper

Test: Blowfish

John The Ripper 1.9.0-jumbo-1 (Real C/S, More Is Better)
AOCC 2.2: 176970 (SE +/- 1313.25, N = 3)
GCC 10.2: 149154 (SE +/- 326.24, N = 3)
Clang 10.1: 177644 (SE +/- 1042.76, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)
AOCC 2.2: 1163950 (SE +/- 1433.03, N = 5)
GCC 10.2: 1014328 (SE +/- 1024.97, N = 5)
Clang 10.1: 1178898 (SE +/- 1470.12, N = 5)
1. (CC) gcc options: -O3 -march=native

CppPerformanceBenchmarks

Test: Stepanov Vector

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AOCC 2.2: 90.50 (SE +/- 0.12, N = 3)
GCC 10.2: 99.19 (SE +/- 0.05, N = 3)
Clang 10.1: 86.12 (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

ASTC Encoder

Preset: Thorough

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AOCC 2.2: 8.04 (SE +/- 0.04, N = 3)
GCC 10.2: 9.09 (SE +/- 0.03, N = 3)
Clang 10.1: 8.11 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Fast

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AOCC 2.2: 4.99 (SE +/- 0.02, N = 3)
GCC 10.2: 5.61 (SE +/- 0.04, N = 3)
Clang 10.1: 5.17 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AOCC 2.2: 19.95 (SE +/- 0.21, N = 3)
GCC 10.2: 22.00 (SE +/- 0.12, N = 3)
Clang 10.1: 19.71 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Medium

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AOCC 2.2: 5.67 (SE +/- 0.01, N = 3)
GCC 10.2: 6.25 (SE +/- 0.04, N = 3)
Clang 10.1: 5.73 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

CppPerformanceBenchmarks

Test: Stepanov Abstraction

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AOCC 2.2: 34.01 (SE +/- 0.13, N = 3)
GCC 10.2: 36.97 (SE +/- 0.00, N = 3)
Clang 10.1: 33.55 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

BRL-CAD

VGR Performance Metric

BRL-CAD 7.30.8 (VGR Performance Metric, More Is Better)
AOCC 2.2: 2693129
GCC 10.2: 2951330
Clang 10.1: 2779511
Configuration-specific options: -flto -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender
1. (CXX) g++ options: -O3 -march=native -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -Qunused-arguments -finline-functions -pedantic -rdynamic -lpthread -ldl -luuid -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
AOCC 2.2: 412.51 (SE +/- 5.71, N = 15)
GCC 10.2: 382.50 (SE +/- 4.16, N = 3)
Clang 10.1: 388.72 (SE +/- 4.72, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
AOCC 2.2: 336.19 (SE +/- 3.05, N = 3)
GCC 10.2: 311.85 (SE +/- 5.02, N = 3)
Clang 10.1: 321.84 (SE +/- 3.32, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

CppPerformanceBenchmarks

Test: Ctype

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AOCC 2.2: 40.21 (SE +/- 0.01, N = 3)
GCC 10.2: 43.27 (SE +/- 0.08, N = 3)
Clang 10.1: 40.34 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
AOCC 2.2: 392.64 (SE +/- 3.17, N = 3)
GCC 10.2: 376.61 (SE +/- 5.95, N = 12)
Clang 10.1: 366.54 (SE +/- 4.02, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

Bullet Physics Engine

Test: 1000 Convex

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
AOCC 2.2: 4.557857 (SE +/- 0.004048, N = 3)
GCC 10.2: 4.873698 (SE +/- 0.010129, N = 3)
Clang 10.1: 4.558771 (SE +/- 0.007811, N = 3)
Configuration-specific options: -lglut -lGL -lGLU
1. (CXX) g++ options: -O3 -march=native -rdynamic

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 2.2: 529 (SE +/- 5.25, N = 15)
GCC 10.2: 497 (SE +/- 4.04, N = 3)
Clang 10.1: 527 (SE +/- 7.36, N = 4)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

Stockfish

Total Time

Stockfish 9 (Nodes Per Second, More Is Better)
AOCC 2.2: 254824742 (SE +/- 254505.98, N = 3)
GCC 10.2: 242098165 (SE +/- 1656059.78, N = 3)
Clang 10.1: 257008114 (SE +/- 1092575.32, N = 3)
1. (CXX) g++ options: -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto

Rodinia

Test: OpenMP Leukocyte

Rodinia 3.1 (Seconds, Fewer Is Better)
AOCC 2.2: 50.14 (SE +/- 0.25, N = 3; options: -O3 -fopenmp)
GCC 10.2: 52.07 (SE +/- 0.62, N = 6; options: -O2 -lOpenCL)
Clang 10.1: 52.71 (SE +/- 0.59, N = 3; options: -O3 -fopenmp)
1. (CXX) g++ options: (none listed)

Bullet Physics Engine

Test: Convex Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
AOCC 2.2: 1.183395 (SE +/- 0.006371, N = 3)
GCC 10.2: 1.233930 (SE +/- 0.001754, N = 3)
Clang 10.1: 1.176618 (SE +/- 0.000573, N = 3)
Configuration-specific options: -lglut -lGL -lGLU
1. (CXX) g++ options: -O3 -march=native -rdynamic

x264

H.264 Video Encoding

x264 2019-12-17 (Frames Per Second, More Is Better)
AOCC 2.2: 204.60 (SE +/- 1.48, N = 3)
GCC 10.2: 204.52 (SE +/- 1.50, N = 3)
Clang 10.1: 195.51 (SE +/- 1.46, N = 3)
Configuration-specific options: -mstack-alignment=64 (listed for two of the three configurations)
1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -m64 -lm -lpthread -O3 -ffast-math -march=native -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7 (Seconds, Fewer Is Better)
AOCC 2.2: 106.58 (SE +/- 0.46, N = 3)
GCC 10.2: 109.59 (SE +/- 0.39, N = 3)
Clang 10.1: 111.35 (SE +/- 0.14, N = 3)
Configuration-specific options: -mabm
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -O3 -std=c99 -pedantic -march=native -lm

Bullet Physics Engine

Test: Prim Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
AOCC 2.2: 1.009440 (SE +/- 0.005065, N = 3)
GCC 10.2: 1.039175 (SE +/- 0.001394, N = 3)
Clang 10.1: 1.021595 (SE +/- 0.000517, N = 3)
Configuration-specific options: -lglut -lGL -lGLU
1. (CXX) g++ options: -O3 -march=native -rdynamic

LevelDB

Benchmark: Hot Read

LevelDB 1.22 (Microseconds Per Op, Fewer Is Better)
AOCC 2.2: 286.17 (SE +/- 2.66, N = 3)
GCC 10.2: 289.70 (SE +/- 3.91, N = 3)
Clang 10.1: 293.72 (SE +/- 1.28, N = 3)
1. (CXX) g++ options: -O3 -march=native -lsnappy -lpthread

CppPerformanceBenchmarks

Test: Math Library

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AOCC 2.2: 334.75 (SE +/- 0.19, N = 3)
GCC 10.2: 339.96 (SE +/- 0.95, N = 3)
Clang 10.1: 331.62 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
AOCC 2.2: 317.38 (SE +/- 0.13, N = 3)
GCC 10.2: 310.14 (SE +/- 0.12, N = 3)
Clang 10.1: 316.54 (SE +/- 0.74, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
AOCC 2.2: 4.310990 (SE +/- 0.013109, N = 3)
GCC 10.2: 4.313341 (SE +/- 0.009658, N = 3)
Clang 10.1: 4.387945 (SE +/- 0.006058, N = 3)
Configuration-specific options: -lglut -lGL -lGLU
1. (CXX) g++ options: -O3 -march=native -rdynamic

SciMark

Computational Test: Monte Carlo

SciMark 2.0 (Mflops, More Is Better)
AOCC 2.2: 620.89 (SE +/- 0.09, N = 3)
GCC 10.2: 611.72 (SE +/- 0.39, N = 3)
Clang 10.1: 620.64 (SE +/- 0.22, N = 3)
1. (CC) gcc options: -O3 -march=native -lm

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.12 (Seconds, Fewer Is Better)
AOCC 2.2: 13.74 (SE +/- 0.05, N = 3)
GCC 10.2: 13.76 (SE +/- 0.07, N = 3)
Clang 10.1: 13.57 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.29 (Requests Per Second, More Is Better)
AOCC 2.2: 27488.36 (SE +/- 575.36, N = 12)
GCC 10.2: 26062.36 (SE +/- 608.38, N = 15)
Clang 10.1: 26903.49 (SE +/- 534.75, N = 15)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 61.87 (SE +/- 1.82, N = 15; MIN: 34.47; -fopenmp=libomp)
GCC 10.2: 351.23 (SE +/- 2.11, N = 3; MIN: 333.05; -fopenmp)
Clang 10.1: 87.60 (SE +/- 1.96, N = 15; MIN: 61.41; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AOCC 2.2: 2.57847 (SE +/- 0.00531, N = 3; MIN: 2.34; -fopenmp=libomp)
GCC 10.2: 2.98436 (SE +/- 0.07819, N = 12; MIN: 2.36; -fopenmp)
Clang 10.1: 2.62134 (SE +/- 0.00761, N = 3; MIN: 2.42; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 2.2: 351 (SE +/- 24.14, N = 15)
GCC 10.2: 107 (SE +/- 1.33, N = 3)
Clang 10.1: 162 (SE +/- 7.14, N = 12)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

Zstd Compression

Compression Level: 19

Zstd Compression 1.4.5 (MB/s, More Is Better)
AOCC 2.2: 128.3 (SE +/- 1.67, N = 3)
GCC 10.2: 125.7 (SE +/- 1.67, N = 3)
Clang 10.1: 120.7 (SE +/- 4.40, N = 12)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Rodinia

Test: OpenMP Streamcluster

Rodinia 3.1 (Seconds, Fewer Is Better)
AOCC 2.2: 10.02 (SE +/- 0.11, N = 15; options: -O3 -fopenmp)
GCC 10.2: 10.01 (SE +/- 0.18, N = 12; options: -O2 -lOpenCL)
Clang 10.1: 10.71 (SE +/- 0.13, N = 5; options: -O3 -fopenmp)
1. (CXX) g++ options: (none listed)


Phoronix Test Suite v10.8.4