AMD AOCC 3.1 Compiler Benchmarking

AMD EPYC 7543 testing of AMD AOCC 3.1 compiler benchmarks by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/2107228-IB-AOCC31BEN66.

System Details (result identifiers: AOCC 3.0, AOCC 3.1)

Processor: AMD EPYC 7543 32-Core @ 2.80GHz (32 Cores / 64 Threads)
Motherboard: TYAN S8036GM2NE-LE (V2.00.B21 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: 1000GB Corsair Force MP600
Graphics: ASPEED
Monitor: VE228
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04
Kernel: 5.11.0-25-generic (x86_64)
Desktop: GNOME Shell 3.38.4
Display Server: X Server
Compiler: Clang 12.0.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=znver3" CFLAGS="-O3 -march=znver3"

Compiler Details
- AOCC 3.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)
- AOCC 3.1: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa001119

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Results Overview: the combined AOCC 3.0 / AOCC 3.1 results table is broken out test by test in the sections that follow.

C-Blosc

Compressor: blosclz

MB/s, more is better (C-Blosc 2.0)
AOCC 3.0: 25045.5 (SE +/- 83.54, N = 3)
AOCC 3.1: 25069.6 (SE +/- 100.31, N = 3)
(CXX) g++ options: -O3 -march=znver3
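A gap this small is easier to judge against the reported standard errors. The sketch below is a rough rule of thumb, not a formal significance test: it divides the difference of the two means by the combined standard error, assuming the two runs are independent. The function name is illustrative and is not part of the Phoronix Test Suite.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return (mean_b - mean_a) in units of the combined standard error."""
    # Standard errors of independent runs add in quadrature.
    combined_se = math.hypot(se_a, se_b)
    if combined_se == 0:
        raise ValueError("both standard errors are zero")
    return (mean_b - mean_a) / combined_se


# C-Blosc blosclz, MB/s: AOCC 3.0 vs. AOCC 3.1, values from the result above.
z = difference_in_standard_errors(25045.5, 83.54, 25069.6, 100.31)
print(f"difference = {z:.2f} combined standard errors")
```

A value well below 1, as here, suggests the two compilers are indistinguishable on this test at N = 3.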

QuantLib

MFLOPS, more is better (QuantLib 1.21)
AOCC 3.0: 2857.6 (SE +/- 3.82, N = 3)
AOCC 3.1: 2854.9 (SE +/- 5.98, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic

Etcpak

Configuration: DXT1

Mpx/s, more is better (Etcpak 0.7)
AOCC 3.0: 2799.90 (SE +/- 10.86, N = 3)
AOCC 3.1: 2897.28 (SE +/- 4.90, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Etcpak

Configuration: ETC2

Mpx/s, more is better (Etcpak 0.7)
AOCC 3.0: 188.82 (SE +/- 0.03, N = 3)
AOCC 3.1: 207.55 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

LZ4 Compression

Compression Level: 1 - Compression Speed

MB/s, more is better (LZ4 Compression 1.9.3)
AOCC 3.0: 10359.14 (SE +/- 12.79, N = 3)
AOCC 3.1: 10964.11 (SE +/- 7.15, N = 3)
(CC) gcc options: -O3

Zstd Compression

Compression Level: 8 - Compression Speed

MB/s, more is better (Zstd Compression 1.5.0)
AOCC 3.0: 2752.3 (SE +/- 25.08, N = 7)
AOCC 3.1: 2852.4 (SE +/- 17.26, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

MB/s, more is better (Zstd Compression 1.5.0)
AOCC 3.0: 42.4 (SE +/- 0.52, N = 4)
AOCC 3.1: 42.1 (SE +/- 0.38, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

MB/s, more is better (Zstd Compression 1.5.0)
AOCC 3.0: 3270.0 (SE +/- 5.68, N = 4)
AOCC 3.1: 3264.4 (SE +/- 6.78, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

JPEG XL

Input: PNG - Encode Speed: 7

MP/s, more is better (JPEG XL 0.3.3)
AOCC 3.0: 9.01 (SE +/- 0.04, N = 3)
AOCC 3.1: 9.04 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 7

MP/s, more is better (JPEG XL 0.3.3)
AOCC 3.0: 64.23 (SE +/- 0.27, N = 3)
AOCC 3.1: 65.89 (SE +/- 0.18, N = 3)
(CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 8

MP/s, more is better (JPEG XL 0.3.3)
AOCC 3.0: 27.44 (SE +/- 0.10, N = 3)
AOCC 3.1: 28.30 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

Botan

Test: AES-256

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 4869.92 (SE +/- 1.46, N = 3)
AOCC 3.1: 4859.66 (SE +/- 3.96, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256 - Decrypt

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 4866.53 (SE +/- 4.76, N = 3)
AOCC 3.1: 4876.96 (SE +/- 6.85, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 332.45 (SE +/- 0.03, N = 3)
AOCC 3.1: 332.81 (SE +/- 0.05, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish - Decrypt

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 339.97 (SE +/- 0.05, N = 3)
AOCC 3.1: 339.97 (SE +/- 0.13, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 417.29 (SE +/- 0.20, N = 3)
AOCC 3.1: 417.05 (SE +/- 0.17, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish - Decrypt

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 404.41 (SE +/- 0.14, N = 3)
AOCC 3.1: 405.31 (SE +/- 0.11, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 134.38 (SE +/- 0.02, N = 3)
AOCC 3.1: 134.37 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

MiB/s, more is better (Botan 2.17.3)
AOCC 3.0: 137.88 (SE +/- 0.01, N = 3)
AOCC 3.1: 137.91 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

LibRaw

Post-Processing Benchmark

Mpix/sec, more is better (LibRaw 0.20)
AOCC 3.0: 44.54 (SE +/- 0.22, N = 3)
AOCC 3.1: 45.11 (SE +/- 0.48, N = 3)
(CXX) g++ options: -O3 -march=znver3 -fopenmp -ljpeg -lz -lm

John The Ripper

Test: Blowfish

Real C/S, more is better (John The Ripper 1.9.0-jumbo-1)
AOCC 3.0: 61893 (SE +/- 194.70, N = 3)
AOCC 3.1: 61842 (SE +/- 125.66, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

John The Ripper

Test: MD5

Real C/S, more is better (John The Ripper 1.9.0-jumbo-1)
AOCC 3.0: 2237667 (SE +/- 8875.68, N = 3)
AOCC 3.1: 2240000 (SE +/- 23259.41, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

GraphicsMagick

Operation: Rotate

Iterations Per Minute, more is better (GraphicsMagick 1.3.33)
AOCC 3.0: 792 (SE +/- 5.24, N = 3)
AOCC 3.1: 792 (SE +/- 3.71, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

Iterations Per Minute, more is better (GraphicsMagick 1.3.33)
AOCC 3.0: 357
AOCC 3.1: 357
SE +/- 0.33, N = 3 (the export lists a single standard-error entry for this test and does not say which run it belongs to)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

Iterations Per Minute, more is better (GraphicsMagick 1.3.33)
AOCC 3.0: 553 (SE +/- 0.67, N = 3)
AOCC 3.1: 696 (SE +/- 0.33, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

Iterations Per Minute, more is better (GraphicsMagick 1.3.33)
AOCC 3.0: 422 (SE +/- 0.67, N = 3)
AOCC 3.1: 420 (SE +/- 1.53, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

Iterations Per Minute, more is better (GraphicsMagick 1.3.33)
AOCC 3.0: 667 (SE +/- 1.73, N = 3)
AOCC 3.1: 677 (SE +/- 1.73, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

Frames Per Second, more is better (SVT-AV1 0.8.7)
AOCC 3.1: 1.987 (SE +/- 0.001, N = 3)
No AOCC 3.0 result is listed for this test.
(CXX) g++ options: -O3 -march=znver3 -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

Frames Per Second, more is better (SVT-AV1 0.8.7)
AOCC 3.1: 19.47 (SE +/- 0.10, N = 3)
No AOCC 3.0 result is listed for this test.
(CXX) g++ options: -O3 -march=znver3 -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

Frames Per Second, more is better (SVT-HEVC 1.5.0)
AOCC 3.0: 292.33 (SE +/- 3.36, N = 3)
AOCC 3.1: 303.75 (SE +/- 4.16, N = 3)
(CC) gcc options: -O3 -march=znver3 -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

Frames Per Second, more is better (SVT-HEVC 1.5.0)
AOCC 3.0: 399.61 (SE +/- 5.37, N = 3)
AOCC 3.1: 408.51 (SE +/- 5.79, N = 15)
(CC) gcc options: -O3 -march=znver3 -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

Frames Per Second, more is better (SVT-VP9 0.3)
AOCC 3.0: 234.66 (SE +/- 0.29, N = 3)
AOCC 3.1: 234.18 (SE +/- 1.81, N = 3)
(CC) gcc options: -O3 -fcommon -march=znver3 -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

VP9 libvpx Encoding

Speed: Speed 5 - Input: Bosphorus 4K

Frames Per Second, more is better (VP9 libvpx Encoding 1.10.0)
AOCC 3.0: 15.61 (SE +/- 0.15, N = 6)
AOCC 3.1: 15.62 (SE +/- 0.12, N = 3)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11

Himeno Benchmark

Poisson Pressure Solver

MFLOPS, more is better (Himeno Benchmark 3.0)
AOCC 3.0: 3779.48 (SE +/- 45.95, N = 15)
AOCC 3.1: 3819.99 (SE +/- 26.55, N = 3)
(CC) gcc options: -O3 -march=znver3 -mavx2

Stockfish

Total Time

Nodes Per Second, more is better (Stockfish 13)
AOCC 3.0: 87952290 (SE +/- 682844.99, N = 10)
AOCC 3.1: 90448030 (SE +/- 655436.02, N = 15)
(CXX) g++ options: -fprofile-use -m64 -lpthread -O3 -march=znver3 -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto

libavif avifenc

Encoder Speed: 2

Seconds, fewer is better (libavif avifenc 0.9.0)
AOCC 3.0: 24.56 (SE +/- 0.10, N = 3)
AOCC 3.1: 24.77 (SE +/- 0.25, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6

Seconds, fewer is better (libavif avifenc 0.9.0)
AOCC 3.0: 9.731 (SE +/- 0.040, N = 3)
AOCC 3.1: 9.672 (SE +/- 0.067, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10

Seconds, fewer is better (libavif avifenc 0.9.0)
AOCC 3.0: 3.542 (SE +/- 0.020, N = 3)
AOCC 3.1: 3.497 (SE +/- 0.022, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6, Lossless

Seconds, fewer is better (libavif avifenc 0.9.0)
AOCC 3.0: 26.73 (SE +/- 0.25, N = 3)
AOCC 3.1: 26.67 (SE +/- 0.14, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10, Lossless

Seconds, fewer is better (libavif avifenc 0.9.0)
AOCC 3.0: 5.842 (SE +/- 0.032, N = 3)
AOCC 3.1: 5.800 (SE +/- 0.053, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

POV-Ray

Trace Time

Seconds, fewer is better (POV-Ray 3.7.0.7)
AOCC 3.0: 17.01 (SE +/- 0.03, N = 3)
AOCC 3.1: 14.78 (SE +/- 0.04, N = 3)
(CXX) g++ options: -pipe -O3 -ffast-math -march=native -march=znver3 -pthread -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
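Because some tests report throughput (more is better) and others report time (fewer is better), a relative gain has to be computed with the ratio oriented accordingly. The following is a minimal sketch of that calculation using the POV-Ray trace times above; the helper name and its `higher_is_better` parameter are illustrative assumptions, not something the Phoronix Test Suite exposes.

```python
def percent_improvement(baseline, candidate, higher_is_better):
    """Return how much better `candidate` is than `baseline`, in percent.

    A negative return value means the candidate is slower than the baseline.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("benchmark results must be positive")
    if higher_is_better:
        ratio = candidate / baseline   # throughput: larger value wins
    else:
        ratio = baseline / candidate   # elapsed time: smaller value wins
    return (ratio - 1.0) * 100.0


# POV-Ray trace time in seconds (fewer is better): AOCC 3.0 -> AOCC 3.1.
povray_gain = percent_improvement(17.01, 14.78, higher_is_better=False)
print(f"POV-Ray: AOCC 3.1 is {povray_gain:.1f}% faster than AOCC 3.0")
```

Orienting the ratio this way keeps gains comparable across tests, so a positive number always means AOCC 3.1 came out ahead.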

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 1.39552 (SE +/- 0.00351, N = 3; MIN: 1.27)
AOCC 3.1: 1.40164 (SE +/- 0.00055, N = 3; MIN: 1.27)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 4.97905 (SE +/- 0.00392, N = 3; MIN: 4.88)
AOCC 3.1: 5.07547 (SE +/- 0.00433, N = 3; MIN: 4.96)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 2.11715 (SE +/- 0.01448, N = 13; MIN: 1.89)
AOCC 3.1: 1.40674 (SE +/- 0.01062, N = 3; MIN: 1.25)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 1.79719 (SE +/- 0.00278, N = 3; MIN: 1.57)
AOCC 3.1: 1.80096 (SE +/- 0.00353, N = 3; MIN: 1.57)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 3.08589 (SE +/- 0.00522, N = 3; MIN: 2.78)
AOCC 3.1: 3.13186 (SE +/- 0.00672, N = 3; MIN: 2.76)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 6386.24 (SE +/- 48.40, N = 15; MIN: 5873.51)
AOCC 3.1: 6432.49 (SE +/- 68.78, N = 3; MIN: 6265.76)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 665.97 (SE +/- 0.22, N = 3; MIN: 659.9)
AOCC 3.1: 668.79 (SE +/- 0.23, N = 3; MIN: 659.59)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, fewer is better (oneDNN 2.1.2)
AOCC 3.0: 0.434393 (SE +/- 0.000098, N = 3; MIN: 0.4)
AOCC 3.1: 0.435999 (SE +/- 0.000575, N = 3; MIN: 0.39)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

FLAC Audio Encoding

WAV To FLAC

Seconds, fewer is better (FLAC Audio Encoding 1.3.2)
AOCC 3.0: 8.697 (SE +/- 0.004, N = 5)
AOCC 3.1: 8.700 (SE +/- 0.002, N = 5)
(CXX) g++ options: -O3 -march=znver3 -logg -lm

LAME MP3 Encoding

WAV To MP3

Seconds, fewer is better (LAME MP3 Encoding 3.100)
AOCC 3.0: 7.877 (SE +/- 0.031, N = 3)
AOCC 3.1: 7.828 (SE +/- 0.003, N = 3)
(CC) gcc options: -O3 -pipe -march=znver3 -lm

Ngspice

Circuit: C2670

Seconds, fewer is better (Ngspice 34)
AOCC 3.0: 92.53 (SE +/- 0.11, N = 3)
AOCC 3.1: 91.52 (SE +/- 0.15, N = 3)
(CC) gcc options: -O3 -march=znver3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Ngspice

Circuit: C7552

Seconds, fewer is better (Ngspice 34)
AOCC 3.0: 83.12 (SE +/- 0.33, N = 3)
AOCC 3.1: 82.28 (SE +/- 0.29, N = 3)
(CC) gcc options: -O3 -march=znver3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

RNNoise

Seconds, fewer is better (RNNoise 2020-06-28)
AOCC 3.0: 18.66 (SE +/- 0.02, N = 3)
AOCC 3.1: 18.73 (SE +/- 0.05, N = 3)
(CC) gcc options: -O3 -march=znver3 -pedantic -fvisibility=hidden

Tachyon

Total Time

Seconds, fewer is better (Tachyon 0.99b6)
AOCC 3.0: 28.89 (SE +/- 0.20, N = 3)
AOCC 3.1: 28.66 (SE +/- 0.03, N = 3)
(CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

Google SynthMark

Test: VoiceMark_100

Voices, more is better (Google SynthMark 20201109)
AOCC 3.0: 622.23 (SE +/- 0.20, N = 3)
AOCC 3.1: 621.63 (SE +/- 0.49, N = 3)
(CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

SecureMark

Benchmark: SecureMark-TLS

marks, more is better (SecureMark 1.0.4)
AOCC 3.0: 275488 (SE +/- 1063.58, N = 3)
AOCC 3.1: 279931 (SE +/- 250.58, N = 3)
(CC) gcc options: -pedantic -O3

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

samples/s, more is better (Liquid-DSP 2021.01.31)
AOCC 3.0: 797476667 (SE +/- 710969.60, N = 3)
AOCC 3.1: 806510000 (SE +/- 596852.86, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

samples/s, more is better (Liquid-DSP 2021.01.31)
AOCC 3.0: 1507633333 (SE +/- 696020.43, N = 3)
AOCC 3.1: 1527933333 (SE +/- 635959.47, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

samples/s, more is better (Liquid-DSP 2021.01.31)
AOCC 3.0: 1891833333 (SE +/- 520683.31, N = 3)
AOCC 3.1: 1894966667 (SE +/- 983756.97, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

FinanceBench

Benchmark: Repo OpenMP

ms, fewer is better (FinanceBench 2016-07-25)
AOCC 3.0: 33706.02 (SE +/- 124.42, N = 3)
AOCC 3.1: 33763.58 (SE +/- 308.02, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

FinanceBench

Benchmark: Bonds OpenMP

ms, fewer is better (FinanceBench 2016-07-25)
AOCC 3.0: 51304.15 (SE +/- 198.20, N = 3)
AOCC 3.1: 51185.49 (SE +/- 150.62, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

libjpeg-turbo tjbench

Test: Decompression Throughput

Megapixels/sec, more is better (libjpeg-turbo tjbench 2.1.0)
AOCC 3.0: 222.01 (SE +/- 0.39, N = 3)
AOCC 3.1: 220.67 (SE +/- 0.05, N = 3)
(CC) gcc options: -O3 -march=znver3 -rdynamic

ASTC Encoder

Preset: Exhaustive

Seconds, fewer is better (ASTC Encoder 3.0)
AOCC 3.0: 21.95 (SE +/- 0.01, N = 3)
AOCC 3.1: 21.89 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -march=znver3 -flto -pthread

SQLite Speedtest

Timed Time - Size 1,000

Seconds, fewer is better (SQLite Speedtest 3.30)
AOCC 3.1: 55.24 (SE +/- 0.06, N = 3)
No AOCC 3.0 result is listed for this test.
(CC) gcc options: -O3 -march=znver3 -ldl -lz -lpthread

Google Draco

Model: Lion

ms, fewer is better (Google Draco 1.4.1)
AOCC 3.0: 5303 (SE +/- 3.84, N = 3)
AOCC 3.1: 4960 (SE +/- 25.93, N = 3)
(CXX) g++ options: -O3 -march=znver3

Google Draco

Model: Church Facade

ms, fewer is better (Google Draco 1.4.1)
AOCC 3.0: 7044 (SE +/- 39.87, N = 3)
AOCC 3.1: 6488 (SE +/- 49.64, N = 14)
(CXX) g++ options: -O3 -march=znver3

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

ms, fewer is better (NCNN 20210525)
AOCC 3.0: 5.79 (SE +/- 0.05, N = 15; MIN: 5.47 / MAX: 7.26)
AOCC 3.1: 5.74 (SE +/- 0.03, N = 15; MIN: 5.3 / MAX: 32.98)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

ms, fewer is better (NCNN 20210525)
AOCC 3.0: 5.73 (SE +/- 0.11, N = 15; MIN: 4.91 / MAX: 48.44)
AOCC 3.1: 5.60 (SE +/- 0.04, N = 15; MIN: 4.66 / MAX: 56.3)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: vgg16

ms, fewer is better (NCNN 20210525)
AOCC 3.0: 61.79 (SE +/- 2.57, N = 15; MIN: 44.7 / MAX: 201.09)
AOCC 3.1: 51.60 (SE +/- 1.18, N = 15; MIN: 29.8 / MAX: 198.51)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet18

ms, fewer is better (NCNN 20210525)
AOCC 3.0: 12.21 (SE +/- 0.12, N = 15; MIN: 11.3 / MAX: 22.59)
AOCC 3.1: 11.88 (SE +/- 0.09, N = 15; MIN: 10.11 / MAX: 110.24)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

TNN

Target: CPU - Model: SqueezeNet v2

ms, fewer is better (TNN 0.3)
AOCC 3.0: 59.51 (SE +/- 0.19, N = 3; MIN: 59.13 / MAX: 60.16)
AOCC 3.1: 60.66 (SE +/- 0.03, N = 3; MIN: 60.48 / MAX: 60.97)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

ms, fewer is better (TNN 0.3)
AOCC 3.0: 262.36 (SE +/- 0.07, N = 3; MIN: 262.07 / MAX: 262.65)
AOCC 3.1: 266.24 (SE +/- 0.08, N = 3; MIN: 265.84 / MAX: 266.54)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Facebook RocksDB

Test: Update Random

Op/s, more is better (Facebook RocksDB 6.22.1)
AOCC 3.0: 820188 (SE +/- 3760.31, N = 3)
AOCC 3.1: 832651 (SE +/- 1446.17, N = 3)
(CXX) g++ options: -O3 -march=native -pthread -fno-rtti -latomic -lpthread

Facebook RocksDB

Test: Read While Writing

Op/s, more is better (Facebook RocksDB 6.22.1)
AOCC 3.0: 6048186 (SE +/- 39263.64, N = 15)
AOCC 3.1: 6135560 (SE +/- 55702.89, N = 3)
(CXX) g++ options: -O3 -march=native -pthread -fno-rtti -latomic -lpthread

Facebook RocksDB

Test: Read Random Write Random

Op/s, more is better (Facebook RocksDB 6.22.1)
AOCC 3.0: 2970829 (SE +/- 4965.23, N = 3)
AOCC 3.1: 3059068 (SE +/- 32512.94, N = 4)
(CXX) g++ options: -O3 -march=native -pthread -fno-rtti -latomic -lpthread

ONNX Runtime

Model: bertsquad-10 - Device: OpenMP CPU

Inferences Per Minute, more is better (ONNX Runtime 1.6)
AOCC 3.0: 474 (SE +/- 6.29, N = 3)
AOCC 3.1: 487 (SE +/- 4.81, N = 5)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: OpenMP CPU

Inferences Per Minute, more is better (ONNX Runtime 1.6)
AOCC 3.0: 86 (SE +/- 0.44, N = 3)
AOCC 3.1: 90 (SE +/- 1.74, N = 11)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: OpenMP CPU

Inferences Per Minute, more is better (ONNX Runtime 1.6)
AOCC 3.0: 4759 (SE +/- 11.37, N = 3)
AOCC 3.1: 4568 (SE +/- 49.29, N = 5)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

WavPack Audio Encoding

WAV To WavPack

Seconds, fewer is better (WavPack Audio Encoding 5.3)
AOCC 3.0: 13.16 (SE +/- 0.00, N = 5)
AOCC 3.1: 13.18 (SE +/- 0.00, N = 5)
(CXX) g++ options: -O3 -march=znver3 -rdynamic

GnuPG

2.7GB Sample File Encryption

Seconds, fewer is better (GnuPG 2.2.27)
AOCC 3.0: 67.74 (SE +/- 0.37, N = 3)
AOCC 3.1: 67.61 (SE +/- 0.24, N = 3)
(CC) gcc options: -O3 -march=znver3


Phoronix Test Suite v10.8.4