clang-load-hoisting-benchmark

Intel Core i7-10750H testing with a CML Azalea_FMS (V1.03 BIOS) and NVIDIA GeForce GTX 1650 Ti 4GB on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2112028-TJ-CLANGLOAD40

Run Management

Result Identifier | Date | Test Run Duration
clang-without-load-hoisting | December 02 2021 | 1 Hour, 12 Minutes
clang-with-load-hoisting | December 02 2021 | 1 Hour, 9 Minutes
Average | | 1 Hour, 10 Minutes


System Details

Processor: Intel Core i7-10750H @ 5.00GHz (6 Cores / 12 Threads)
Motherboard: CML Azalea_FMS (V1.03 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 16GB
Disk: 1024GB Micron_2210_MTFDHBA1T0QFD
Graphics: NVIDIA GeForce GTX 1650 Ti 4GB
Audio: Realtek ALC255
Network: Realtek RTL8111/8168/8411 + Intel Wi-Fi 6 AX201
OS: Ubuntu 20.04
Kernel: 5.11.0-41-generic (x86_64)
Desktop: GNOME Shell 3.36.9
Display Server: X Server 1.20.11
Display Driver: NVIDIA 460.91.03
OpenGL: 4.6.0
Vulkan: 1.2.145
Compiler: Clang 14.0.0 + GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs

- Transparent Huge Pages: madvise
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: skylake
- Scaling Governor: intel_pstate powersave - CPU Microcode: 0xea - Thermald 1.9.1
- itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

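clang-without-load-hoisting vs. clang-with-load-hoisting Comparison

Test | Difference | Faster
PostgreSQL pgbench: 1 - 1 - Read Write - Average Latency | 12.8% | clang-with-load-hoisting
PostgreSQL pgbench: 1 - 1 - Read Write | 12% | clang-with-load-hoisting
Minion: Graceful | 10.7% | clang-with-load-hoisting
CppPerformanceBenchmarks: Stepanov Abstraction | 9.3% | clang-with-load-hoisting
libjpeg-turbo tjbench: Decompression Throughput | 8.5% | clang-with-load-hoisting
Timed MrBayes Analysis: Primate Phylogeny Analysis | 7.6% | clang-with-load-hoisting
SciMark: Composite | 6.9% | clang-with-load-hoisting
C-Ray: Total Time - 4K, 16 Rays Per Pixel | 6.6% | clang-with-load-hoisting
XZ Compression: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 | 6.3% | clang-with-load-hoisting
Timed PHP Compilation: Time To Compile | 5.1% | clang-with-load-hoisting
ebizzy | 4.9% | clang-with-load-hoisting
Tungsten Renderer: Hair | 4.8% | clang-with-load-hoisting
AOBench: 2048 x 2048 - Total Time | 3.9% | clang-with-load-hoisting
Coremark: CoreMark Size 666 - Iterations Per Second | 3.7% | clang-without-load-hoisting
asmFish: 1024 Hash Memory, 26 Depth | 3.3% | clang-with-load-hoisting
x264: H.264 Video Encoding | 2.4% | clang-with-load-hoisting
John The Ripper: MD5 | 2.2% | clang-with-load-hoisting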

Result Table

Test | clang-without-load-hoisting | clang-with-load-hoisting
dav1d: Summer Nature 1080p | 391.92 | 398.92
svt-hevc: 7 - Bosphorus 1080p | 70.15 | 70.43
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 91.53 | 91.26
vpxenc: Speed 5 - Bosphorus 1080p | 25.80 | 26.00
x264: H.264 Video Encoding | 59.06 | 60.47
x265: Bosphorus 1080p | 41.42 | 41.77
graphics-magick: HWB Color Space | 188 | 191
coremark: CoreMark Size 666 - Iterations Per Second | 194017.729947 | 187025.854725
compress-zstd: 3 - Compression Speed | 1179.1 | 1193.4
compress-zstd: 3 - Decompression Speed | 2768.5 | 2770.8
tjbench: Decompression Throughput | 180.004465 | 195.225357
scimark2: Composite | 2681.29 | 2866.49
himeno: Poisson Pressure Solver | 3502.632568 | 3504.476284
tscp: AI Chess Performance | 1334249 | 1334897
asmfish: 1024 Hash Memory, 26 Depth | 17275878 | 17850230
john-the-ripper: MD5 | 89182 | 91163
ebizzy | 299810 | 314646
pgbench: 1 - 1 - Read Write | 1299 | 1455
pgbench: 1 - 1 - Read Write - Average Latency | 0.775 | 0.687
mrbayes: Primate Phylogeny Analysis | 134.717 | 125.193
build-php: Time To Compile | 64.900 | 61.759
c-ray: Total Time - 4K, 16 Rays Per Pixel | 111.496 | 104.556
tungsten: Hair | 43.4899 | 41.4801
aobench: 2048 x 2048 - Total Time | 32.598 | 31.388
compress-xz: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 | 54.609 | 51.379
minion: Graceful | 54.147663 | 48.908291
cpp-perf-bench: Stepanov Abstraction | 32.727 | 29.949
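The percentages in the comparison summary appear to be the ratio of the larger value to the smaller one, minus one, with the winner chosen according to whether the test is "More Is Better" or "Fewer Is Better". The following is a minimal sketch of that arithmetic, not the Phoronix Test Suite's own code: the function name `relative_difference` is hypothetical, and the four input pairs are copied from this result file.

```python
def relative_difference(with_hoisting, without_hoisting, higher_is_better):
    """Return (faster_run, percent_difference) for one pair of results.

    Assumption: the percentage is (larger / smaller - 1) * 100, which
    reproduces the figures shown in this result file.
    """
    if with_hoisting == without_hoisting:
        return "tie", 0.0
    larger = max(with_hoisting, without_hoisting)
    smaller = min(with_hoisting, without_hoisting)
    percent = (larger / smaller - 1.0) * 100.0
    # "with" wins when it is larger on a higher-is-better test,
    # or smaller on a lower-is-better test.
    with_wins = (with_hoisting > without_hoisting) == higher_is_better
    faster = "clang-with-load-hoisting" if with_wins else "clang-without-load-hoisting"
    return faster, percent


# (name, with, without, higher_is_better), values taken from this result file
results = [
    ("pgbench TPS", 1455, 1299, True),
    ("pgbench average latency (ms)", 0.687, 0.775, False),
    ("coremark iterations/sec", 187025.854725, 194017.729947, True),
    ("minion (seconds)", 48.908291, 54.147663, False),
]

for name, with_value, without_value, higher_is_better in results:
    faster, percent = relative_difference(with_value, without_value, higher_is_better)
    print(f"{name}: {percent:.1f}% in favour of {faster}")
```

For these four pairs the sketch yields 12.0%, 12.8%, 3.7% and 10.7%, which agrees with the comparison figures, including Coremark being the one test that favours clang-without-load-hoisting.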

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

dav1d 0.9.2, Video Input: Summer Nature 1080p (FPS, More Is Better)
clang-with-load-hoisting: 398.92 (SE +/- 0.98, N = 3; MIN: 364.8 / MAX: 431.63)
clang-without-load-hoisting: 391.92 (SE +/- 2.82, N = 3; MIN: 350.25 / MAX: 427.87)
1. (CC) clang options: -Qunused-arguments -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -lasound -lSDL2 -lsndio -lXv -lX11 -lXext -pthread -lbz2 -O3 -march=native -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -mstack-alignment=16 -MMD -MF -MT

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC 1.5.0, Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
clang-with-load-hoisting: 70.43 (SE +/- 0.22, N = 3)
clang-without-load-hoisting: 70.15 (SE +/- 0.17, N = 3)

SVT-VP9

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.

SVT-VP9 0.3, Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
clang-with-load-hoisting: 91.26 (SE +/- 1.49, N = 3)
clang-without-load-hoisting: 91.53 (SE +/- 1.14, N = 3)

VP9 libvpx Encoding

This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9 video format. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
clang-with-load-hoisting: 26.00 (SE +/- 0.02, N = 3)
clang-without-load-hoisting: 25.80 (SE +/- 0.04, N = 3)
1. (CC) clang options: -m64 -lpthread

x264

This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.

x264 2019-12-17, H.264 Video Encoding (Frames Per Second, More Is Better)
clang-with-load-hoisting: 60.47 (SE +/- 0.68, N = 3)
clang-without-load-hoisting: 59.06 (SE +/- 0.85, N = 3)

x265

This is a simple test of the x265 encoder run on the CPU, with 1080p and 4K input options, measuring H.265 video encode performance. Learn more via the OpenBenchmarking.org test page.

x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
clang-with-load-hoisting: 41.77 (SE +/- 0.17, N = 3)
clang-without-load-hoisting: 41.42 (SE +/- 0.45, N = 3)

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
clang-with-load-hoisting: 191
clang-without-load-hoisting: 188
SE +/- 0.58, N = 3 (reported for only one of the two runs; the source does not indicate which)

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
clang-with-load-hoisting: 187025.85 (SE +/- 4680.12, N = 3)
clang-without-load-hoisting: 194017.73 (SE +/- 576.77, N = 3)
1. (CC) clang options: -O2 -O3 -march=native -lrt" -lrt
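Coremark is the one result where the run without load hoisting comes out ahead, but the with-load-hoisting run also has a large standard error (4680.12 over N = 3). A rough way to judge whether such a gap stands out from run-to-run noise is to compare it with the combined standard error of the two means. The sketch below is a rule-of-thumb check, not something the Phoronix Test Suite reports: the function name is hypothetical, the inputs are the Coremark figures from this page, and with only three samples per run the ratio should be read as indicative rather than as a formal significance test.

```python
import math


def difference_vs_noise(mean_a, se_a, mean_b, se_b):
    """Return (difference, combined_se, ratio) for two independent means.

    combined_se is sqrt(se_a^2 + se_b^2); ratio is |difference| / combined_se.
    Assumption: the two runs are independent and SE is the standard error
    of each mean, as printed in the result file.
    """
    difference = mean_a - mean_b
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return difference, combined_se, abs(difference) / combined_se


# Coremark: clang-with-load-hoisting vs. clang-without-load-hoisting
diff, combined, ratio = difference_vs_noise(187025.85, 4680.12, 194017.73, 576.77)
print(f"difference = {diff:.2f}, combined SE = {combined:.2f}, ratio = {ratio:.2f}")
```

A ratio well above 2 is commonly taken as a sign that a difference is larger than the noise. For Coremark the ratio comes out at roughly 1.5, so the 3.7% gap may not reflect a real regression.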

Zstd Compression

This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.5.0, Compression Level: 3 - Compression Speed (MB/s, More Is Better)
clang-with-load-hoisting: 1193.4 (SE +/- 10.58, N = 3)
clang-without-load-hoisting: 1179.1 (SE +/- 12.07, N = 3)

Zstd Compression 1.5.0, Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
clang-with-load-hoisting: 2770.8 (SE +/- 2.67, N = 3)
clang-without-load-hoisting: 2768.5 (SE +/- 2.43, N = 3)

libjpeg-turbo tjbench

tjbench is a JPEG decompression/compression benchmark that is part of libjpeg-turbo, a JPEG image codec library optimized for SIMD instructions on modern CPU architectures. Learn more via the OpenBenchmarking.org test page.

libjpeg-turbo tjbench 2.1.0, Test: Decompression Throughput (Megapixels/sec, More Is Better)
clang-with-load-hoisting: 195.23 (SE +/- 0.35, N = 3)
clang-without-load-hoisting: 180.00 (SE +/- 0.41, N = 3)

SciMark

This test runs the ANSI C version of SciMark 2.0, a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. It is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.

SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
clang-with-load-hoisting: 2866.49 (SE +/- 136.09, N = 3)
clang-without-load-hoisting: 2681.29 (SE +/- 14.91, N = 3)
1. (CC) clang options: -O3 -march=native -lm

Himeno Benchmark

The Himeno benchmark is a linear solver for the pressure Poisson equation using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
clang-with-load-hoisting: 3504.48 (SE +/- 1.69, N = 3)
clang-without-load-hoisting: 3502.63 (SE +/- 5.98, N = 3)
1. (CC) clang options: -O3 -march=native -mavx2

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
clang-with-load-hoisting: 1334897 (SE +/- 792.90, N = 5)
clang-without-load-hoisting: 1334249 (SE +/- 647.40, N = 5)
1. (CC) clang options: -O3 -march=native

asmFish

This is a test of asmFish, an advanced chess benchmark written in Assembly. Learn more via the OpenBenchmarking.org test page.

asmFish 2018-07-23, 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
clang-with-load-hoisting: 17850230 (SE +/- 27684.37, N = 3)
clang-without-load-hoisting: 17275878 (SE +/- 292729.63, N = 3)

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

John The Ripper 1.9.0-jumbo-1, Test: MD5 (Real C/S, More Is Better)
clang-with-load-hoisting: 91163 (SE +/- 652.06, N = 3)
clang-without-load-hoisting: 89182 (SE +/- 712.43, N = 3)

ebizzy

This is a test of ebizzy, a program to generate workloads resembling web server workloads. Learn more via the OpenBenchmarking.org test page.

ebizzy 0.3 (Records/s, More Is Better)
clang-with-load-hoisting: 314646 (SE +/- 8892.96, N = 3)
clang-without-load-hoisting: 299810 (SE +/- 8335.76, N = 3)
1. (CC) clang options: -pthread -lpthread -O3 -march=native

PostgreSQL pgbench

This is a benchmark of PostgreSQL that uses pgbench to drive the database workload. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench 14.0, Scaling Factor: 1 - Clients: 1 - Mode: Read Write (TPS, More Is Better)
clang-with-load-hoisting: 1455 (SE +/- 14.48, N = 3)
clang-without-load-hoisting: 1299 (SE +/- 74.13, N = 3)

PostgreSQL pgbench 14.0, Scaling Factor: 1 - Clients: 1 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
clang-with-load-hoisting: 0.687 (SE +/- 0.007, N = 3)
clang-without-load-hoisting: 0.775 (SE +/- 0.042, N = 3)
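With a single client, the two pgbench results are close to being two views of the same measurement: if transactions run back to back, the average latency in milliseconds is about 1000 divided by the transactions per second. The sketch below checks that relation against the figures on this page. The helper name is hypothetical, and the back-to-back assumption is a simplification rather than a statement of how pgbench derives its reported latency.

```python
def latency_ms_from_tps(tps, clients=1):
    """Approximate average latency in ms for `clients` busy connections.

    Assumption: every client issues transactions back to back, so each
    transaction takes clients / tps seconds on average.
    """
    return 1000.0 * clients / tps


# (run, TPS, reported average latency in ms), values from this result file
runs = [
    ("clang-with-load-hoisting", 1455, 0.687),
    ("clang-without-load-hoisting", 1299, 0.775),
]

for name, tps, reported_ms in runs:
    estimated_ms = latency_ms_from_tps(tps)
    print(f"{name}: estimated {estimated_ms:.3f} ms, reported {reported_ms:.3f} ms")
```

The estimate matches the reported 0.687 ms for the with-load-hoisting run and lands slightly below the reported 0.775 ms for the other run. One possible reason for the small gap is that the reported latency is averaged over the three runs rather than derived from the mean TPS, which matters more for the run with the larger spread (SE +/- 74.13 TPS).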

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

Timed MrBayes Analysis 3.2.7, Primate Phylogeny Analysis (Seconds, Fewer Is Better)
clang-with-load-hoisting: 125.19 (SE +/- 0.41, N = 3)
clang-without-load-hoisting: 134.72 (SE +/- 1.63, N = 3)

Timed HMMer Search

This test searches through the Pfam database of profile hidden Markov models. The search finds the domain structure of the Drosophila Sevenless protein. Learn more via the OpenBenchmarking.org test page.

clang-without-load-hoisting: The test quit with a non-zero exit status.

clang-with-load-hoisting: The test quit with a non-zero exit status.

Timed PHP Compilation

This test times how long it takes to build PHP 7. Learn more via the OpenBenchmarking.org test page.

Timed PHP Compilation 7.4.2, Time To Compile (Seconds, Fewer Is Better)
clang-with-load-hoisting: 61.76 (SE +/- 0.08, N = 3)
clang-without-load-hoisting: 64.90 (SE +/- 0.84, N = 3)

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
clang-with-load-hoisting: 104.56 (SE +/- 0.35, N = 3)
clang-without-load-hoisting: 111.50 (SE +/- 0.57, N = 3)
1. (CC) clang options: -lm -lpthread -O3 -march=native

Tungsten Renderer

Tungsten is a C++ physically based renderer that makes use of Intel's Embree ray tracing library. Learn more via the OpenBenchmarking.org test page.

Tungsten Renderer 0.2.2, Scene: Hair (Seconds, Fewer Is Better)
clang-with-load-hoisting: 41.48 (SE +/- 0.02, N = 3)
clang-without-load-hoisting: 43.49 (SE +/- 0.31, N = 3)

AOBench

AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

AOBench, Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
clang-with-load-hoisting: 31.39 (SE +/- 1.87, N = 3)
clang-without-load-hoisting: 32.60 (SE +/- 1.59, N = 3)
1. (CC) clang options: -lm -O3 -march=native

XZ Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

XZ Compression 5.2.4, Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (Seconds, Fewer Is Better)
clang-with-load-hoisting: 51.38 (SE +/- 0.04, N = 3)
clang-without-load-hoisting: 54.61 (SE +/- 0.91, N = 3)

Minion

Minion is an open-source constraint solver that is designed to be very scalable. This test profile solves Minion's integrated benchmark problems. Learn more via the OpenBenchmarking.org test page.

Minion 1.8, Benchmark: Graceful (Seconds, Fewer Is Better)
clang-with-load-hoisting: 48.91 (SE +/- 1.42, N = 3)
clang-without-load-hoisting: 54.15 (SE +/- 0.53, N = 3)

CppPerformanceBenchmarks

CppPerformanceBenchmarks is a set of C++ compiler performance benchmarks. Learn more via the OpenBenchmarking.org test page.

CppPerformanceBenchmarks 9, Test: Stepanov Abstraction (Seconds, Fewer Is Better)
clang-with-load-hoisting: 29.95 (SE +/- 0.36, N = 3)
clang-without-load-hoisting: 32.73 (SE +/- 0.45, N = 3)