EPYC 7502 AOCC 2.3 Compiler Comparison

AMD EPYC 7502 testing of various benchmarks under AMD AOCC 2.3, GCC 10.2, LLVM Clang 11. CFLAGS/CXXFLAGS of "-O3 -march=znver2" throughout. Benchmarks by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2012080-HA-EPYC7502A97
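A minimal sketch of scripting that comparison run, assuming `phoronix-test-suite` is installed and on `PATH`, and assuming the test profiles pick up `CFLAGS`/`CXXFLAGS` from the environment. The helper name `build_invocation` is hypothetical; the result ID and flags are the ones stated above.

```python
import os
import subprocess

RESULT_ID = "2012080-HA-EPYC7502A97"  # result file ID given on this page
FLAGS = "-O3 -march=znver2"           # CFLAGS/CXXFLAGS stated in the description


def build_invocation(base_env=None):
    """Return (command, environment) for a comparison run (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["CFLAGS"] = FLAGS
    env["CXXFLAGS"] = FLAGS
    return ["phoronix-test-suite", "benchmark", RESULT_ID], env


if __name__ == "__main__":
    cmd, env = build_invocation()
    print(" ".join(cmd))
    # Uncomment to launch the run; the runs on this page took several hours each.
    # subprocess.run(cmd, env=env, check=True)
```

On a CPU other than Zen 2, `-march=znver2` may need to be changed to a target that matches the local hardware.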

Result Identifier | Date | Test Duration
GCC 10.2 | December 07 2020 | 3 Hours, 49 Minutes
LLVM Clang 11 | December 08 2020 | 6 Hours, 56 Minutes
AMD AOCC 2.3 | December 07 2020 | 3 Hours, 58 Minutes
Average test duration: 4 Hours, 54 Minutes
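The 4 Hours, 54 Minutes figure is consistent with the mean of the three run durations listed above. A quick check, as a sketch (the helper name `mean_duration` is illustrative):

```python
from datetime import timedelta

# Test durations as listed for each run
durations = {
    "GCC 10.2": timedelta(hours=3, minutes=49),
    "LLVM Clang 11": timedelta(hours=6, minutes=56),
    "AMD AOCC 2.3": timedelta(hours=3, minutes=58),
}


def mean_duration(values):
    """Arithmetic mean of a collection of timedelta values."""
    values = list(values)
    return sum(values, timedelta()) / len(values)


avg = mean_duration(durations.values())
hours, remainder = divmod(int(avg.total_seconds()), 3600)
print(f"{hours} Hours, {remainder // 60} Minutes")
```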



HTML result view exported from: https://openbenchmarking.org/result/2012080-HA-EPYC7502A97&sgm=1&swl=1&grs&sro.

EPYC 7502 AOCC 2.3 Compiler Comparison - System Details

Processor: AMD EPYC 7502 32-Core @ 2.50GHz (32 Cores / 64 Threads)
Motherboard: ASRockRack EPYCD8 (P2.10 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 280GB INTEL SSDPED1D280GA
Graphics: ASPEED
Audio: AMD Starship/Matisse
Monitor: VE228
Network: 2 x Intel I350
OS: Ubuntu 20.10
Kernel: 5.8.0-31-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: modesetting 1.20.9
Compiler: GCC 10.2.0 (GCC 10.2); Clang 11.0.0-2 (LLVM Clang 11); Clang 11.0.0 (AMD AOCC 2.3)
File-System: ext4
Screen Resolution: 1920x1080

Environment Details
- CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"

Compiler Details
- GCC 10.2: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- AMD AOCC 2.3: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver2

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x830101c

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

EPYC 7502 AOCC 2.3 Compiler Comparison - Results Overview

Test | GCC 10.2 | LLVM Clang 11 | AMD AOCC 2.3
ncnn: CPU - blazeface | 3.99 | 2.84 | 2.19
ncnn: CPU - mnasnet | 8.95 | 6.26 | 5.01
c-ray: Total Time - 4K, 16 Rays Per Pixel | 18.922 | 30.646 | 33.275
onednn: Recurrent Neural Network Training - f32 - CPU | 254.610 | 172.513 | 147.523
ncnn: CPU-v3-v3 - mobilenet-v3 | 8.88 | 6.87 | 5.20
ncnn: CPU-v2-v2 - mobilenet-v2 | 9.37 | 6.88 | 5.62
ncnn: CPU - efficientnet-b0 | 11.32 | 8.77 | 6.94
dav1d: Chimera 1080p 10-bit | 143.02 | 92.56 | 122.39
daphne: OpenMP - Points2Image | 18452.623694826 | 11946.566719313 | 13720.155459109
libraw: Post-Processing Benchmark | 52.54 | 36.98 | 38.47
svt-av1: Enc Mode 0 - 1080p | 0.103 | 0.145 | 0.146
graphics-magick: Sharpen | 434 | 384 | 316
openssl: RSA 4096-bit Performance | 7395.4 | 5412.5 | 5413.5
daphne: OpenMP - Euclidean Cluster | 890.71 | 674.01 | 678.51
svt-av1: Enc Mode 4 - 1080p | 6.775 | 8.591 | 8.657
svt-av1: Enc Mode 8 - 1080p | 55.713 | 70.229 | 70.698
graphics-magick: Resizing | 2092 | 1827 | 1653
ncnn: CPU - googlenet | 20.85 | 18.88 | 16.51
encode-mp3: WAV To MP3 | 8.798 | 10.078 | 11.026
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.532944 | 0.575864 | 0.464235
onednn: IP Batch 1D - f32 - CPU | 1.67503 | 1.72247 | 1.40879
tnn: CPU - MobileNet v2 | 324.172 | 392.204 | 365.390
astcenc: Medium | 7.00 | 6.05 | 5.81
ncnn: CPU - resnet50 | 23.70 | 22.23 | 19.70
ncnn: CPU - squeezenet | 17.29 | 15.35 | 14.38
graphics-magick: Enhanced | 590 | 627 | 526
ncnn: CPU - vgg16 | 30.89 | 36.60 | 32.03
astcenc: Thorough | 9.61 | 8.32 | 8.16
webp: Quality 100, Highest Compression | 8.861 | 7.655 | 7.546
onednn: Deconvolution Batch deconv_1d - f32 - CPU | 1.95695 | 1.69390 | 1.66772
onednn: IP Batch 1D - u8s8f32 - CPU | 1.17556 | 1.03137 | 1.00926
aobench: 2048 x 2048 - Total Time | 35.783 | 41.655 | 37.758
onednn: IP Batch All - u8s8f32 - CPU | 13.8793 | 13.4649 | 11.9233
ncnn: CPU - mobilenet | 19.46 | 17.85 | 17.02
tscp: AI Chess Performance | 1007283 | 1143642 | 1138442
vpxenc: Speed 0 | 6.27 | 5.96 | 6.65
astcenc: Exhaustive | 73.46 | 67.17 | 66.04
basis: UASTC Level 2 + RDO Post-Processing | 755.517 | 837.541 | 833.536
vpxenc: Speed 5 | 19.03 | 18.23 | 20.12
ncnn: CPU - resnet18 | 13.12 | 13.19 | 11.99
onednn: Deconvolution Batch deconv_1d - u8s8f32 - CPU | 2.17957 | 2.05539 | 1.99141
daphne: OpenMP - NDT Mapping | 874.54 | 945.32 | 923.68
stockfish: Total Time | 58347386 | 62434784 | 62375605
compress-lz4: 3 - Compression Speed | 45.57 | 48.40 | 48.52
onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU | 1.94380 | 1.98927 | 1.87485
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 348.24 | 365.57 | 368.17
graphics-magick: Swirl | 1295 | 1323 | 1368
x265: Bosphorus 4K | 22.83 | 22.44 | 23.69
webp: Quality 100, Lossless, Highest Compression | 44.383 | 42.377 | 42.704
rnnoise | 21.723 | 21.540 | 20.756
ncnn: CPU - yolov4-tiny | 29.47 | 30.70 | 29.38
pgbench: 1 - 50 - Read Only - Average Latency | 0.096 | 0.094 | 0.092
compress-zstd: 19 | 109.9 | 111.2 | 114.6
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 279.73 | 286.24 | 291.22
compress-lz4: 1 - Compression Speed | 9459.69 | 9838.54 | 9780.39
scimark2: Composite | 2759.35 | 2673.79 | 2779.62
sqlite-speedtest: Timed Time - Size 1,000 | 75.140 | 77.642 | 78.067
mrbayes: Primate Phylogeny Analysis | 94.284 | 93.841 | 90.759
pgbench: 1 - 50 - Read Only | 521332 | 530684 | 541314
x264: H.264 Video Encoding | 149.35 | 146.43 | 151.92
dav1d: Summer Nature 1080p | 567.42 | 584.65 | 588.62
pgbench: 1 - 1 - Read Write - Average Latency | 0.268 | 0.265 | 0.259
pgbench: 1 - 1 - Read Write | 3739 | 3772 | 3865
compress-lz4: 9 - Compression Speed | 44.72 | 44.30 | 45.76
svt-vp9: VMAF Optimized - Bosphorus 1080p | 354.98 | 363.72 | 366.61
x265: Bosphorus 1080p | 49.05 | 50.02 | 50.60
pgbench: 1 - 1 - Read Only - Average Latency | 0.035 | 0.035 | 0.034
tjbench: Decompression Throughput | 171.888797 | 174.991039 | 176.873794
cryptopp: Unkeyed Algorithms | 305.900410 | 314.389914 | 312.637840
pgbench: 1 - 1 - Read Only | 28919 | 28921 | 29645
tnn: CPU - SqueezeNet v1.1 | 305.134 | 311.682 | 304.178
nginx: Static Web Page Serving | 30658.67 | 30676.68 | 31368.86
dav1d: Summer Nature 4K | 269.72 | 275.41 | 274.20
graphics-magick: Rotate | 535 | 527 | 525
dav1d: Chimera 1080p | 564.96 | 572.19 | 575.22
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 3.68073 | 3.72354 | 3.66096
basis: UASTC Level 3 | 25.522 | 25.321 | 25.210
basis: UASTC Level 2 | 16.507 | 16.485 | 16.315
pgbench: 1 - 50 - Read Write | 3453 | 3443 | 3413
pgbench: 1 - 50 - Read Write - Average Latency | 14.488 | 14.528 | 14.655
compress-zstd: 3 | 7849.6 | 7866.2 | 7937.8
webp: Quality 100, Lossless | 20.799 | 20.719 | 20.904
hint: FLOAT | 291925951.64210 | 292874904.53733 | 294314450.61706
ncnn: CPU - alexnet | 9.46 | 10.83 | 8.94
ncnn: CPU - shufflenet-v2 | 10.59 | 7.35 | 6.25
redis: SET | 1369757.13 | 1483322.31 | 1446872.35
redis: GET | 1809976.83 | 2122749.43 | 1874175.91
redis: LPUSH | 1212068.06 | 1304842.23 | 1380024.29
onednn: Recurrent Neural Network Inference - f32 - CPU | 79.2495 | 64.9912 | 30.7182

NCNN

Target: CPU - Model: blazeface

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 2.19 (SE +/- 0.03, N = 3; -lomp; MIN: 2.1 / MAX: 2.43)
GCC 10.2: 3.99 (SE +/- 0.10, N = 3; -lgomp; MIN: 3.77 / MAX: 4.73)
LLVM Clang 11: 2.84 (SE +/- 0.02, N = 15; -lomp; MIN: 2.67 / MAX: 4.67)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 5.01 (SE +/- 0.03, N = 3; -lomp; MIN: 4.87 / MAX: 5.39)
GCC 10.2: 8.95 (SE +/- 0.29, N = 3; -lgomp; MIN: 8.26 / MAX: 11.03)
LLVM Clang 11: 6.26 (SE +/- 0.09, N = 15; -lomp; MIN: 5.63 / MAX: 8.5)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 - Seconds, Fewer Is Better
AMD AOCC 2.3: 33.28 (SE +/- 0.06, N = 3)
GCC 10.2: 18.92 (SE +/- 0.02, N = 3)
LLVM Clang 11: 30.65 (SE +/- 0.08, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=znver2

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 147.52 (SE +/- 0.06, N = 3; -fopenmp=libomp; MIN: 146.31)
GCC 10.2: 254.61 (SE +/- 0.73, N = 3; -fopenmp; MIN: 251.67)
LLVM Clang 11: 172.51 (SE +/- 1.10, N = 3; -fopenmp=libomp; MIN: 169.54)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 5.20 (SE +/- 0.04, N = 3; -lomp; MIN: 5.05 / MAX: 7.61)
GCC 10.2: 8.88 (SE +/- 0.11, N = 3; -lgomp; MIN: 8.58 / MAX: 10.83)
LLVM Clang 11: 6.87 (SE +/- 0.06, N = 15; -lomp; MIN: 6.16 / MAX: 8.64)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 5.62 (SE +/- 0.08, N = 3; -lomp; MIN: 5.35 / MAX: 7.49)
GCC 10.2: 9.37 (SE +/- 0.24, N = 3; -lgomp; MIN: 8.62 / MAX: 11)
LLVM Clang 11: 6.88 (SE +/- 0.06, N = 15; -lomp; MIN: 6.35 / MAX: 16.49)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 6.94 (SE +/- 0.05, N = 3; -lomp; MIN: 6.72 / MAX: 9)
GCC 10.2: 11.32 (SE +/- 0.09, N = 3; -lgomp; MIN: 11.03 / MAX: 13.16)
LLVM Clang 11: 8.77 (SE +/- 0.09, N = 15; -lomp; MIN: 7.99 / MAX: 13.6)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 0.7.0 - FPS, More Is Better
AMD AOCC 2.3: 122.39 (SE +/- 0.34, N = 3; MIN: 85.78 / MAX: 202.39)
GCC 10.2: 143.02 (SE +/- 0.18, N = 3; MIN: 98.76 / MAX: 246.17)
LLVM Clang 11: 92.56 (SE +/- 0.26, N = 3; MIN: 61.05 / MAX: 158.66)
(CC) gcc options: -O3 -march=znver2 -pthread
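Because some results are "Fewer Is Better" and others "More Is Better", comparing compilers across tests requires normalizing each result in the right direction first. The following is a sketch of one way to do that, using three of the results reported above and GCC 10.2 as an arbitrarily chosen baseline. The function names are illustrative, and the output covers only these three tests, so it is not the overall geometric mean of the result file.

```python
from math import prod

# (direction, {compiler: reported value}) for three results listed above
RESULTS = {
    "NCNN blazeface (ms)": (
        "fewer",
        {"AMD AOCC 2.3": 2.19, "GCC 10.2": 3.99, "LLVM Clang 11": 2.84},
    ),
    "C-Ray 4K, 16 rays per pixel (s)": (
        "fewer",
        {"AMD AOCC 2.3": 33.28, "GCC 10.2": 18.92, "LLVM Clang 11": 30.65},
    ),
    "dav1d Chimera 1080p 10-bit (FPS)": (
        "more",
        {"AMD AOCC 2.3": 122.39, "GCC 10.2": 143.02, "LLVM Clang 11": 92.56},
    ),
}


def relative_to(baseline, direction, values):
    """Express each value as a speedup over the baseline (>1 means faster)."""
    base = values[baseline]
    if direction == "fewer":
        # Lower is better: the baseline time divided by this time.
        return {name: base / value for name, value in values.items()}
    # Higher is better: this throughput divided by the baseline throughput.
    return {name: value / base for name, value in values.items()}


def geometric_mean(xs):
    xs = list(xs)
    return prod(xs) ** (1.0 / len(xs))


def summarize(baseline="GCC 10.2"):
    """Geometric mean of per-test speedups for each compiler."""
    per_test = [relative_to(baseline, d, v) for d, v in RESULTS.values()]
    compilers = per_test[0].keys()
    return {c: geometric_mean(r[c] for r in per_test) for c in compilers}


if __name__ == "__main__":
    for compiler, score in summarize().items():
        print(f"{compiler}: {score:.3f}x relative to GCC 10.2")
```

A geometric mean is used rather than an arithmetic mean so that a 2x win on one test and a 2x loss on another cancel out.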

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Points2Image

Darmstadt Automotive Parallel Heterogeneous Suite - Test Cases Per Minute, More Is Better
AMD AOCC 2.3: 13720.16 (SE +/- 163.63, N = 6)
GCC 10.2: 18452.62 (SE +/- 140.90, N = 3)
LLVM Clang 11: 11946.57 (SE +/- 96.60, N = 3)
(CXX) g++ options: -O3 -std=c++11 -fopenmp

LibRaw

Post-Processing Benchmark

LibRaw 0.20 - Mpix/sec, More Is Better
AMD AOCC 2.3: 38.47 (SE +/- 0.10, N = 3)
GCC 10.2: 52.54 (SE +/- 0.16, N = 3)
LLVM Clang 11: 36.98 (SE +/- 0.13, N = 3)
(CXX) g++ options: -O3 -march=znver2 -fopenmp -ljpeg -lz -lm

SVT-AV1

Encoder Mode: Enc Mode 0 - Input: 1080p

SVT-AV1 0.8 - Frames Per Second, More Is Better
AMD AOCC 2.3: 0.146 (SE +/- 0.000, N = 3)
GCC 10.2: 0.103 (SE +/- 0.000, N = 3)
LLVM Clang 11: 0.145 (SE +/- 0.000, N = 3)
(CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 - Iterations Per Minute, More Is Better
AMD AOCC 2.3: 316
GCC 10.2: 434
LLVM Clang 11: 384
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1 - Signs Per Second, More Is Better
AMD AOCC 2.3: 5413.5 (SE +/- 0.64, N = 3)
GCC 10.2: 7395.4 (SE +/- 0.73, N = 3)
LLVM Clang 11: 5412.5 (SE +/- 1.37, N = 3)
Additional option reported for two of the three runs: -Qunused-arguments
(CC) gcc options: -pthread -m64 -O3 -march=znver2 -lssl -lcrypto -ldl

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Euclidean Cluster

Darmstadt Automotive Parallel Heterogeneous Suite - Test Cases Per Minute, More Is Better
AMD AOCC 2.3: 678.51 (SE +/- 0.47, N = 3)
GCC 10.2: 890.71 (SE +/- 0.77, N = 3)
LLVM Clang 11: 674.01 (SE +/- 2.53, N = 3)
(CXX) g++ options: -O3 -std=c++11 -fopenmp

SVT-AV1

Encoder Mode: Enc Mode 4 - Input: 1080p

SVT-AV1 0.8 - Frames Per Second, More Is Better
AMD AOCC 2.3: 8.657 (SE +/- 0.042, N = 3)
GCC 10.2: 6.775 (SE +/- 0.025, N = 3)
LLVM Clang 11: 8.591 (SE +/- 0.026, N = 3)
(CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 8 - Input: 1080p

SVT-AV1 0.8 - Frames Per Second, More Is Better
AMD AOCC 2.3: 70.70 (SE +/- 0.46, N = 3)
GCC 10.2: 55.71 (SE +/- 0.29, N = 3)
LLVM Clang 11: 70.23 (SE +/- 0.30, N = 3)
(CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 - Iterations Per Minute, More Is Better
AMD AOCC 2.3: 1653 (SE +/- 22.27, N = 3)
GCC 10.2: 2092 (SE +/- 17.89, N = 3)
LLVM Clang 11: 1827 (SE +/- 10.17, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 16.51 (SE +/- 0.05, N = 3; -lomp; MIN: 16.26 / MAX: 18.75)
GCC 10.2: 20.85 (SE +/- 0.10, N = 3; -lgomp; MIN: 19.8 / MAX: 22.84)
LLVM Clang 11: 18.88 (SE +/- 0.27, N = 15; -lomp; MIN: 16.91 / MAX: 23.31)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 - Seconds, Fewer Is Better
AMD AOCC 2.3: 11.026 (SE +/- 0.004, N = 3)
GCC 10.2: 8.798 (SE +/- 0.004, N = 3)
LLVM Clang 11: 10.078 (SE +/- 0.003, N = 3)
Additional options reported for one of the three runs: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr
(CC) gcc options: -O3 -pipe -march=znver2 -lncurses -lm

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 0.464235 (SE +/- 0.000544, N = 3; -fopenmp=libomp; MIN: 0.45)
GCC 10.2: 0.532944 (SE +/- 0.001691, N = 3; -fopenmp; MIN: 0.51)
LLVM Clang 11: 0.575864 (SE +/- 0.001767, N = 3; -fopenmp=libomp; MIN: 0.55)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch 1D - Data Type: f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 1.40879 (SE +/- 0.00432, N = 3; -fopenmp=libomp; MIN: 1.36)
GCC 10.2: 1.67503 (SE +/- 0.00519, N = 3; -fopenmp; MIN: 1.59)
LLVM Clang 11: 1.72247 (SE +/- 0.00256, N = 3; -fopenmp=libomp; MIN: 1.62)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

TNN

Target: CPU - Model: MobileNet v2

TNN 0.2.3 - ms, Fewer Is Better
AMD AOCC 2.3: 365.39 (SE +/- 0.24, N = 3; -fopenmp=libomp; MIN: 364.57 / MAX: 366.34)
GCC 10.2: 324.17 (SE +/- 0.62, N = 3; -fopenmp; MIN: 311.9 / MAX: 354.48)
LLVM Clang 11: 392.20 (SE +/- 0.45, N = 3; -fopenmp=libomp; MIN: 390.71 / MAX: 393.87)
(CXX) g++ options: -O3 -march=znver2 -pthread -fvisibility=hidden -rdynamic -ldl

ASTC Encoder

Preset: Medium

ASTC Encoder 2.0 - Seconds, Fewer Is Better
AMD AOCC 2.3: 5.81 (SE +/- 0.01, N = 3)
GCC 10.2: 7.00 (SE +/- 0.01, N = 3)
LLVM Clang 11: 6.05 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 19.70 (SE +/- 0.18, N = 3; -lomp; MIN: 19.16 / MAX: 22.41)
GCC 10.2: 23.70 (SE +/- 0.16, N = 3; -lgomp; MIN: 23.21 / MAX: 25.68)
LLVM Clang 11: 22.23 (SE +/- 0.20, N = 15; -lomp; MIN: 20.63 / MAX: 31.79)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: squeezenet

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 14.38 (SE +/- 0.10, N = 3; -lomp; MIN: 13.96 / MAX: 16.87)
GCC 10.2: 17.29 (SE +/- 0.12, N = 3; -lgomp; MIN: 16.89 / MAX: 19.32)
LLVM Clang 11: 15.35 (SE +/- 0.14, N = 15; -lomp; MIN: 14.29 / MAX: 18.88)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 - Iterations Per Minute, More Is Better
AMD AOCC 2.3: 526
GCC 10.2: 590
LLVM Clang 11: 627
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 32.03 (SE +/- 0.31, N = 3; -lomp; MIN: 30.86 / MAX: 35.2)
GCC 10.2: 30.89 (SE +/- 0.30, N = 3; -lgomp; MIN: 30.04 / MAX: 50.21)
LLVM Clang 11: 36.60 (SE +/- 0.42, N = 15; -lomp; MIN: 32.93 / MAX: 48.08)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 2.0 - Seconds, Fewer Is Better
AMD AOCC 2.3: 8.16 (SE +/- 0.00, N = 3)
GCC 10.2: 9.61 (SE +/- 0.00, N = 3)
LLVM Clang 11: 8.32 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 - Encode Time - Seconds, Fewer Is Better
AMD AOCC 2.3: 7.546 (SE +/- 0.008, N = 3)
GCC 10.2: 8.861 (SE +/- 0.003, N = 3)
LLVM Clang 11: 7.655 (SE +/- 0.008, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 1.66772 (SE +/- 0.00186, N = 3; -fopenmp=libomp; MIN: 1.61)
GCC 10.2: 1.95695 (SE +/- 0.01336, N = 3; -fopenmp; MIN: 1.87)
LLVM Clang 11: 1.69390 (SE +/- 0.01023, N = 3; -fopenmp=libomp; MIN: 1.63)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 1.00926 (SE +/- 0.00275, N = 3; -fopenmp=libomp; MIN: 0.95)
GCC 10.2: 1.17556 (SE +/- 0.00350, N = 3; -fopenmp; MIN: 1.13)
LLVM Clang 11: 1.03137 (SE +/- 0.00137, N = 3; -fopenmp=libomp; MIN: 0.97)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

AOBench

Size: 2048 x 2048 - Total Time

AOBench - Seconds, Fewer Is Better
AMD AOCC 2.3: 37.76 (SE +/- 0.04, N = 3)
GCC 10.2: 35.78 (SE +/- 0.02, N = 3)
LLVM Clang 11: 41.66 (SE +/- 0.02, N = 3)
(CC) gcc options: -lm -O3 -march=znver2

oneDNN

Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 11.92 (SE +/- 0.02, N = 3; -fopenmp=libomp; MIN: 11.65)
GCC 10.2: 13.88 (SE +/- 0.06, N = 3; -fopenmp; MIN: 13.35)
LLVM Clang 11: 13.46 (SE +/- 0.04, N = 3; -fopenmp=libomp; MIN: 13.14)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

NCNN

Target: CPU - Model: mobilenet

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 17.02 (SE +/- 0.34, N = 3; -lomp; MIN: 16.39 / MAX: 20.11)
GCC 10.2: 19.46 (SE +/- 0.12, N = 3; -lgomp; MIN: 18.87 / MAX: 31.76)
LLVM Clang 11: 17.85 (SE +/- 0.13, N = 15; -lomp; MIN: 16.89 / MAX: 21.04)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

TSCP

AI Chess Performance

TSCP 1.81 - Nodes Per Second, More Is Better
AMD AOCC 2.3: 1138442 (SE +/- 471.20, N = 5)
GCC 10.2: 1007283 (SE +/- 1467.20, N = 5)
LLVM Clang 11: 1143642 (SE +/- 582.00, N = 5)
(CC) gcc options: -O3 -march=znver2 -march=native

VP9 libvpx Encoding

Speed: Speed 0

VP9 libvpx Encoding 1.8.2 - Frames Per Second, More Is Better
AMD AOCC 2.3: 6.65 (SE +/- 0.00, N = 3)
GCC 10.2: 6.27 (SE +/- 0.02, N = 3)
LLVM Clang 11: 5.96 (SE +/- 0.09, N = 3)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver2 -fPIC -U_FORTIFY_SOURCE -std=c++11

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 2.0 - Seconds, Fewer Is Better
AMD AOCC 2.3: 66.04 (SE +/- 0.05, N = 3)
GCC 10.2: 73.46 (SE +/- 0.02, N = 3)
LLVM Clang 11: 67.17 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

Basis Universal

Settings: UASTC Level 2 + RDO Post-Processing

Basis Universal 1.12 - Seconds, Fewer Is Better
AMD AOCC 2.3: 833.54 (SE +/- 0.04, N = 3)
GCC 10.2: 755.52 (SE +/- 0.14, N = 3)
LLVM Clang 11: 837.54 (SE +/- 0.24, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

VP9 libvpx Encoding

Speed: Speed 5

VP9 libvpx Encoding 1.8.2 - Frames Per Second, More Is Better
AMD AOCC 2.3: 20.12 (SE +/- 0.10, N = 3)
GCC 10.2: 19.03 (SE +/- 0.04, N = 3)
LLVM Clang 11: 18.23 (SE +/- 0.07, N = 3)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver2 -fPIC -U_FORTIFY_SOURCE -std=c++11

NCNN

Target: CPU - Model: resnet18

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 11.99 (SE +/- 0.15, N = 3; -lomp; MIN: 11.71 / MAX: 14.27)
GCC 10.2: 13.12 (SE +/- 0.09, N = 3; -lgomp; MIN: 12.79 / MAX: 15.11)
LLVM Clang 11: 13.19 (SE +/- 0.12, N = 15; -lomp; MIN: 12.17 / MAX: 23.53)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 1.99141 (SE +/- 0.00112, N = 3; -fopenmp=libomp; MIN: 1.92)
GCC 10.2: 2.17957 (SE +/- 0.00145, N = 3; -fopenmp; MIN: 2.05)
LLVM Clang 11: 2.05539 (SE +/- 0.00233, N = 3; -fopenmp=libomp; MIN: 1.96)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: NDT Mapping

Darmstadt Automotive Parallel Heterogeneous Suite - Test Cases Per Minute, More Is Better
AMD AOCC 2.3: 923.68 (SE +/- 1.07, N = 3)
GCC 10.2: 874.54 (SE +/- 2.97, N = 3)
LLVM Clang 11: 945.32 (SE +/- 2.29, N = 3)
(CXX) g++ options: -O3 -std=c++11 -fopenmp

Stockfish

Total Time

Stockfish 12 - Nodes Per Second, More Is Better
AMD AOCC 2.3: 62375605 (SE +/- 379890.17, N = 3; -flto=thin)
GCC 10.2: 58347386 (SE +/- 872585.44, N = 3; -flto -flto=jobserver)
LLVM Clang 11: 62434784 (SE +/- 877956.02, N = 4; -flto=thin)
(CXX) g++ options: -m64 -lpthread -O3 -march=znver2 -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2

LZ4 Compression

Compression Level: 3 - Compression Speed

LZ4 Compression 1.9.3 - MB/s, More Is Better
AMD AOCC 2.3: 48.52 (SE +/- 0.00, N = 3)
GCC 10.2: 45.57 (SE +/- 0.66, N = 3)
LLVM Clang 11: 48.40 (SE +/- 0.04, N = 3)
(CC) gcc options: -O3

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 - ms, Fewer Is Better
AMD AOCC 2.3: 1.87485 (SE +/- 0.00189, N = 3; -fopenmp=libomp; MIN: 1.82)
GCC 10.2: 1.94380 (SE +/- 0.00392, N = 3; -fopenmp; MIN: 1.82)
LLVM Clang 11: 1.98927 (SE +/- 0.00691, N = 3; -fopenmp=libomp; MIN: 1.9)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 - Frames Per Second, More Is Better
AMD AOCC 2.3: 368.17 (SE +/- 0.54, N = 3)
GCC 10.2: 348.24 (SE +/- 1.42, N = 3)
LLVM Clang 11: 365.57 (SE +/- 1.74, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33 - Iterations Per Minute, More Is Better
AMD AOCC 2.3: 1368 (SE +/- 4.26, N = 3)
GCC 10.2: 1295 (SE +/- 1.00, N = 3)
LLVM Clang 11: 1323 (SE +/- 2.40, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

x265

Video Input: Bosphorus 4K

x265 3.4 - Frames Per Second, More Is Better
AMD AOCC 2.3: 23.69 (SE +/- 0.05, N = 3)
GCC 10.2: 22.83 (SE +/- 0.06, N = 3)
LLVM Clang 11: 22.44 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.1 - Encode Time - Seconds, Fewer Is Better
AMD AOCC 2.3: 42.70 (SE +/- 0.01, N = 3)
GCC 10.2: 44.38 (SE +/- 0.12, N = 3)
LLVM Clang 11: 42.38 (SE +/- 0.12, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

RNNoise

RNNoise 2020-06-28 - Seconds, Fewer Is Better
AMD AOCC 2.3: 20.76 (SE +/- 0.01, N = 3)
GCC 10.2: 21.72 (SE +/- 0.02, N = 3)
LLVM Clang 11: 21.54 (SE +/- 0.00, N = 3)
(CC) gcc options: -O3 -march=znver2 -pedantic -fvisibility=hidden

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20200916 - ms, Fewer Is Better
AMD AOCC 2.3: 29.38 (SE +/- 0.15, N = 3; -lomp; MIN: 28.77 / MAX: 31.68)
GCC 10.2: 29.47 (SE +/- 0.12, N = 3; -lgomp; MIN: 28.89 / MAX: 31.49)
LLVM Clang 11: 30.70 (SE +/- 0.19, N = 15; -lomp; MIN: 29.08 / MAX: 40.6)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread
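Several results in this file differ by less than their reported "SE +/-" values suggest is meaningful. Below is a rough sketch of such a check, assuming "SE" denotes the standard error of the mean over the N runs and treating two combined standard errors as the threshold. That cutoff is an arbitrary choice made here, not something the result file specifies, and the function names are illustrative.

```python
from math import sqrt


def differs_beyond_noise(a, se_a, b, se_b, k=2.0):
    """True if |a - b| exceeds k combined standard errors (k is an assumed cutoff)."""
    return abs(a - b) > k * sqrt(se_a ** 2 + se_b ** 2)


def std_dev_from_se(se, n):
    """Recover the sample standard deviation implied by a standard error and N."""
    return se * sqrt(n)


# NCNN yolov4-tiny results (ms) and standard errors as reported above
aocc = (29.38, 0.15)
gcc = (29.47, 0.12)
clang = (30.70, 0.19)

print("AOCC vs GCC differ beyond noise:", differs_beyond_noise(*aocc, *gcc))
print("AOCC vs Clang differ beyond noise:", differs_beyond_noise(*aocc, *clang))
```

With only N = 3 runs for most results, this is a coarse screen rather than a formal significance test.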

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Only - Average Latency

PostgreSQL pgbench 13.0 - ms, Fewer Is Better
AMD AOCC 2.3: 0.092 (SE +/- 0.000, N = 3)
GCC 10.2: 0.096 (SE +/- 0.001, N = 3)
LLVM Clang 11: 0.094 (SE +/- 0.001, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

Zstd Compression

Compression Level: 19

Zstd Compression 1.4.5 - MB/s, More Is Better
AMD AOCC 2.3: 114.6 (SE +/- 0.26, N = 3)
GCC 10.2: 109.9 (SE +/- 0.23, N = 3)
LLVM Clang 11: 111.2 (SE +/- 0.30, N = 3)
(CC) gcc options: -O3 -march=znver2 -pthread -lz

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 - Frames Per Second, More Is Better
AMD AOCC 2.3: 291.22 (SE +/- 0.90, N = 3)
GCC 10.2: 279.73 (SE +/- 1.18, N = 3)
LLVM Clang 11: 286.24 (SE +/- 1.68, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 - MB/s, More Is Better
AMD AOCC 2.3: 9780.39 (SE +/- 45.16, N = 3)
GCC 10.2: 9459.69 (SE +/- 56.41, N = 3)
LLVM Clang 11: 9838.54 (SE +/- 55.02, N = 3)
(CC) gcc options: -O3

SciMark

Computational Test: Composite

SciMark 2.0 - Mflops, More Is Better
AMD AOCC 2.3: 2779.62 (SE +/- 43.99, N = 3)
GCC 10.2: 2759.35 (SE +/- 14.97, N = 3)
LLVM Clang 11: 2673.79 (SE +/- 7.04, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 - Seconds, Fewer Is Better
AMD AOCC 2.3: 78.07 (SE +/- 0.09, N = 3)
GCC 10.2: 75.14 (SE +/- 0.13, N = 3)
LLVM Clang 11: 77.64 (SE +/- 0.18, N = 3)
(CC) gcc options: -O3 -march=znver2 -ldl -lz -lpthread

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7 - Seconds, Fewer Is Better
AMD AOCC 2.3: 90.76 (SE +/- 0.04, N = 3)
GCC 10.2: 94.28 (SE +/- 0.17, N = 3)
LLVM Clang 11: 93.84 (SE +/- 0.04, N = 3)
Additional option reported for one of the three runs: -mabm
(CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -O3 -std=c99 -pedantic -march=znver2 -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Only

PostgreSQL pgbench 13.0 - TPS, More Is Better
AMD AOCC 2.3: 541314 (SE +/- 798.75, N = 3)
GCC 10.2: 521332 (SE +/- 4440.25, N = 3)
LLVM Clang 11: 530684 (SE +/- 4278.21, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

x264

H.264 Video Encoding

x264 2019-12-17 - Frames Per Second, More Is Better
AMD AOCC 2.3: 151.92 (SE +/- 0.67, N = 3)
GCC 10.2: 149.35 (SE +/- 1.24, N = 3)
LLVM Clang 11: 146.43 (SE +/- 0.59, N = 3)
Additional option reported for two of the three runs: -mstack-alignment=64
(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=znver2 -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

dav1d

Video Input: Summer Nature 1080p

dav1d 0.7.0 - FPS, More Is Better
AMD AOCC 2.3: 588.62 (SE +/- 2.09, N = 3; MIN: 345.64 / MAX: 651.08)
GCC 10.2: 567.42 (SE +/- 0.36, N = 3; MIN: 337.19 / MAX: 625.71)
LLVM Clang 11: 584.65 (SE +/- 0.95, N = 3; MIN: 337.56 / MAX: 641.35)
(CC) gcc options: -O3 -march=znver2 -pthread

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Write - Average Latency

PostgreSQL pgbench 13.0 - Scaling Factor: 1 - Clients: 1 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
AMD AOCC 2.3: 0.259 (SE +/- 0.001, N = 3)
GCC 10.2: 0.268 (SE +/- 0.003, N = 5)
LLVM Clang 11: 0.265 (SE +/- 0.000, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Write

PostgreSQL pgbench 13.0 - Scaling Factor: 1 - Clients: 1 - Mode: Read Write (TPS, More Is Better)
AMD AOCC 2.3: 3865 (SE +/- 18.01, N = 3)
GCC 10.2: 3739 (SE +/- 46.08, N = 5)
LLVM Clang 11: 3772 (SE +/- 3.59, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

LZ4 Compression

Compression Level: 9 - Compression Speed

LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better)
AMD AOCC 2.3: 45.76 (SE +/- 0.03, N = 3)
GCC 10.2: 44.72 (SE +/- 0.60, N = 3)
LLVM Clang 11: 44.30 (SE +/- 0.02, N = 3)
(CC) gcc options: -O3

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 - Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
AMD AOCC 2.3: 366.61 (SE +/- 1.39, N = 3)
GCC 10.2: 354.98 (SE +/- 2.15, N = 3)
LLVM Clang 11: 363.72 (SE +/- 1.70, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

x265

Video Input: Bosphorus 1080p

x265 3.4 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
AMD AOCC 2.3: 50.60 (SE +/- 0.14, N = 3)
GCC 10.2: 49.05 (SE +/- 0.13, N = 3)
LLVM Clang 11: 50.02 (SE +/- 0.09, N = 3)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Only - Average Latency

PostgreSQL pgbench 13.0 - Scaling Factor: 1 - Clients: 1 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
AMD AOCC 2.3: 0.034 (SE +/- 0.000, N = 3)
GCC 10.2: 0.035 (SE +/- 0.000, N = 3)
LLVM Clang 11: 0.035 (SE +/- 0.000, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.0.2 - Test: Decompression Throughput (Megapixels/sec, More Is Better)
AMD AOCC 2.3: 176.87 (SE +/- 0.02, N = 3)
GCC 10.2: 171.89 (SE +/- 0.04, N = 3)
LLVM Clang 11: 174.99 (SE +/- 0.21, N = 3)
(CC) gcc options: -O3 -march=znver2 -rdynamic

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2 - Test: Unkeyed Algorithms (MiB/second, More Is Better)
AMD AOCC 2.3: 312.64 (SE +/- 0.16, N = 3)
GCC 10.2: 305.90 (SE +/- 0.09, N = 3)
LLVM Clang 11: 314.39 (SE +/- 0.20, N = 3)
(CXX) g++ options: -O3 -march=znver2 -fPIC -pthread -pipe

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Only

PostgreSQL pgbench 13.0 - Scaling Factor: 1 - Clients: 1 - Mode: Read Only (TPS, More Is Better)
AMD AOCC 2.3: 29645 (SE +/- 73.04, N = 3)
GCC 10.2: 28919 (SE +/- 137.51, N = 3)
LLVM Clang 11: 28921 (SE +/- 252.07, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
AMD AOCC 2.3: 304.18 (SE +/- 0.66, N = 3; -fopenmp=libomp; MIN: 302.78 / MAX: 315.91)
GCC 10.2: 305.13 (SE +/- 0.20, N = 3; -fopenmp; MIN: 304.36 / MAX: 306.06)
LLVM Clang 11: 311.68 (SE +/- 1.92, N = 3; -fopenmp=libomp; MIN: 306.99 / MAX: 314.32)
(CXX) g++ options: -O3 -march=znver2 -pthread -fvisibility=hidden -rdynamic -ldl

NGINX Benchmark

Static Web Page Serving

NGINX Benchmark 1.9.9 - Static Web Page Serving (Requests Per Second, More Is Better)
AMD AOCC 2.3: 31368.86 (SE +/- 254.59, N = 15)
GCC 10.2: 30658.67 (SE +/- 381.73, N = 4)
LLVM Clang 11: 30676.68 (SE +/- 159.74, N = 3)
(CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native -march=znver2

dav1d

Video Input: Summer Nature 4K

dav1d 0.7.0 - Video Input: Summer Nature 4K (FPS, More Is Better)
AMD AOCC 2.3: 274.20 (SE +/- 1.22, N = 3; MIN: 151.98 / MAX: 293.44)
GCC 10.2: 269.72 (SE +/- 0.57, N = 3; MIN: 160.12 / MAX: 288.9)
LLVM Clang 11: 275.41 (SE +/- 1.08, N = 3; MIN: 155.99 / MAX: 295.35)
(CC) gcc options: -O3 -march=znver2 -pthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 - Operation: Rotate (Iterations Per Minute, More Is Better)
AMD AOCC 2.3: 525
GCC 10.2: 535
LLVM Clang 11: 527
Standard errors listed: SE +/- 1.00, N = 3; SE +/- 5.24, N = 3
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

dav1d

Video Input: Chimera 1080p

dav1d 0.7.0 - Video Input: Chimera 1080p (FPS, More Is Better)
AMD AOCC 2.3: 575.22 (SE +/- 0.49, N = 3; MIN: 414.12 / MAX: 729.95)
GCC 10.2: 564.96 (SE +/- 1.05, N = 3; MIN: 399.64 / MAX: 724.27)
LLVM Clang 11: 572.19 (SE +/- 0.97, N = 3; MIN: 404.8 / MAX: 726.7)
(CC) gcc options: -O3 -march=znver2 -pthread

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU

oneDNN 1.5 - Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
AMD AOCC 2.3: 3.66096 (SE +/- 0.01101, N = 3; -fopenmp=libomp; MIN: 3.49)
GCC 10.2: 3.68073 (SE +/- 0.01624, N = 3; -fopenmp; MIN: 3.53)
LLVM Clang 11: 3.72354 (SE +/- 0.00775, N = 3; -fopenmp=libomp; MIN: 3.57)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.12 - Settings: UASTC Level 3 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 25.21 (SE +/- 0.00, N = 3)
GCC 10.2: 25.52 (SE +/- 0.01, N = 3)
LLVM Clang 11: 25.32 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.12 - Settings: UASTC Level 2 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 16.32 (SE +/- 0.01, N = 3)
GCC 10.2: 16.51 (SE +/- 0.02, N = 3)
LLVM Clang 11: 16.49 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Write

PostgreSQL pgbench 13.0 - Scaling Factor: 1 - Clients: 50 - Mode: Read Write (TPS, More Is Better)
AMD AOCC 2.3: 3413 (SE +/- 5.71, N = 3)
GCC 10.2: 3453 (SE +/- 5.86, N = 3)
LLVM Clang 11: 3443 (SE +/- 4.43, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Write - Average Latency

PostgreSQL pgbench 13.0 - Scaling Factor: 1 - Clients: 50 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
AMD AOCC 2.3: 14.66 (SE +/- 0.02, N = 3)
GCC 10.2: 14.49 (SE +/- 0.02, N = 3)
LLVM Clang 11: 14.53 (SE +/- 0.02, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

Zstd Compression

Compression Level: 3

Zstd Compression 1.4.5 - Compression Level: 3 (MB/s, More Is Better)
AMD AOCC 2.3: 7937.8 (SE +/- 3.73, N = 3)
GCC 10.2: 7849.6 (SE +/- 6.11, N = 3)
LLVM Clang 11: 7866.2 (SE +/- 30.93, N = 3)
(CC) gcc options: -O3 -march=znver2 -pthread -lz

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 2.3: 20.90 (SE +/- 0.03, N = 3)
GCC 10.2: 20.80 (SE +/- 0.08, N = 3)
LLVM Clang 11: 20.72 (SE +/- 0.02, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

Hierarchical INTegration

Test: FLOAT

Hierarchical INTegration 1.0 - Test: FLOAT (QUIPs, More Is Better)
AMD AOCC 2.3: 294314450.62 (SE +/- 30193.74, N = 3)
GCC 10.2: 291925951.64 (SE +/- 15707.07, N = 3)
LLVM Clang 11: 292874904.54 (SE +/- 170353.62, N = 3)
(CC) gcc options: -O3 -march=znver2 -march=native -lm

NCNN

Target: CPU - Model: alexnet

NCNN 20200916 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
AMD AOCC 2.3: 8.94 (SE +/- 0.01, N = 3; -lomp; MIN: 8.81 / MAX: 13.49)
GCC 10.2: 9.46 (SE +/- 0.14, N = 3; -lgomp; MIN: 9.17 / MAX: 11.49)
LLVM Clang 11: 10.83 (SE +/- 0.18, N = 15; -lomp; MIN: 9.15 / MAX: 60.4)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20200916 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
AMD AOCC 2.3: 6.25 (SE +/- 0.12, N = 3; -lomp; MIN: 5.96 / MAX: 6.53)
GCC 10.2: 10.59 (SE +/- 0.68, N = 3; -lgomp; MIN: 9.42 / MAX: 13.72)
LLVM Clang 11: 7.35 (SE +/- 0.02, N = 15; -lomp; MIN: 7.09 / MAX: 11.01)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

Redis

Test: SET

Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better)
AMD AOCC 2.3: 1446872.35 (SE +/- 27170.26, N = 15)
GCC 10.2: 1369757.13 (SE +/- 24810.19, N = 15)
LLVM Clang 11: 1483322.31 (SE +/- 22282.05, N = 15)
(CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

Redis

Test: GET

Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better)
AMD AOCC 2.3: 1874175.91 (SE +/- 30693.11, N = 15)
GCC 10.2: 1809976.83 (SE +/- 18838.90, N = 3)
LLVM Clang 11: 2122749.43 (SE +/- 49885.99, N = 15)
(CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

Redis

Test: LPUSH

Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better)
AMD AOCC 2.3: 1380024.29 (SE +/- 18763.42, N = 3)
GCC 10.2: 1212068.06 (SE +/- 21030.89, N = 15)
LLVM Clang 11: 1304842.23 (SE +/- 22719.73, N = 15)
(CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 1.5 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
AMD AOCC 2.3: 30.72 (SE +/- 0.32, N = 3; -fopenmp=libomp; MIN: 29.35)
GCC 10.2: 79.25 (SE +/- 0.80, N = 3; -fopenmp; MIN: 77.35)
LLVM Clang 11: 64.99 (SE +/- 2.37, N = 15; -fopenmp=libomp; MIN: 50.95)
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Geometric Mean Of All Test Results

Result Composite - EPYC 7502 AOCC 2.3 Compiler Comparison

Geometric Mean Of All Test Results (Geometric Mean, More Is Better)
AMD AOCC 2.3: 121.37
GCC 10.2: 113.07
LLVM Clang 11: 115.39
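
For readers who want to see how a composite of this kind can be formed, here is a minimal Python sketch. It is not the Phoronix Test Suite implementation: it assumes that lower-is-better results are inverted (1 / value) before averaging, and it uses only three results from this file (x264, x265 and SQLite Speedtest), so it will not reproduce the 121.37 / 113.07 / 115.39 figures, which cover all 89 tests. The names `geometric_mean`, `composite` and `SAMPLE` are illustrative.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed through logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty collection of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def composite(results):
    """Combine (value, higher_is_better) pairs into one more-is-better score.

    Assumption: lower-is-better values are inverted so that every term
    points in the same direction before the geometric mean is taken.
    """
    return geometric_mean(
        value if higher_is_better else 1.0 / value
        for value, higher_is_better in results
    )


# Three results per compiler taken from this result file:
# x264 (FPS), x265 (FPS), SQLite Speedtest (seconds, fewer is better).
SAMPLE = {
    "AMD AOCC 2.3": [(151.92, True), (50.60, True), (78.07, False)],
    "GCC 10.2": [(149.35, True), (49.05, True), (75.14, False)],
    "LLVM Clang 11": [(146.43, True), (50.02, True), (77.64, False)],
}

scores = {name: composite(results) for name, results in SAMPLE.items()}
baseline = scores["GCC 10.2"]
for name, score in scores.items():
    print(f"{name}: {score / baseline:.3f} relative to GCC 10.2")
```

A geometric mean is the usual choice for this kind of summary because the tests use different units and scales, and it weights relative differences equally rather than letting large-valued tests dominate.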

Number Of First Place Finishes

Wins - 89 Tests

Number Of First Place Finishes (Wins - 89 Tests)
AMD AOCC 2.3: 61 [68.5%]
GCC 10.2: 17 [19.1%]
LLVM Clang 11: 11 [12.4%]

Number Of Last Place Finishes

Losses - 89 Tests

Number Of Last Place Finishes (Losses - 89 Tests)
AMD AOCC 2.3: 10 [11.2%]
GCC 10.2: 56 [62.9%]
LLVM Clang 11: 23 [25.8%]
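
The bracketed percentages are each compiler's share of the 89 tests. A short Python sketch of that arithmetic, using the first-place and last-place counts from this file (the helper name `shares` is illustrative):

```python
# First-place and last-place counts as reported in this result file.
wins = {"AMD AOCC 2.3": 61, "GCC 10.2": 17, "LLVM Clang 11": 11}
losses = {"AMD AOCC 2.3": 10, "GCC 10.2": 56, "LLVM Clang 11": 23}


def shares(counts):
    """Return each entry's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


for label, counts in (("Wins", wins), ("Losses", losses)):
    print(f"{label} - {sum(counts.values())} Tests")
    for name, pct in shares(counts).items():
        print(f"  {name}: {counts[name]} [{pct}%]")
```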


Phoronix Test Suite v10.8.4