xeon auggy

Tests for a future article. 2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2308065-NE-XEONAUGGY78&grr&sro.

xeon auggy: system details (identical for runs a and b)

Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 512GB
Disk: 7682GB INTEL SSDPF2KX076TZ
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.10
Kernel: 6.2.0-rc5-phx-dodt (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.3
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0xd000389

Java Details - OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu122.10.1)

Python Details - Python 3.10.7

Security Details - dodt: Mitigation of DOITM + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

xeon auggy: result overview for runs a and b. Individual results are listed per test below.

Timed GCC Compilation

Time To Compile

Timed GCC Compilation 13.2, Seconds (Fewer Is Better)
a: 957.95
b: 956.13
SE +/- 1.97, N = 2

Apache CouchDB

Bulk Size: 500 - Inserts: 1000 - Rounds: 30

Apache CouchDB 3.3.2, Seconds (Fewer Is Better)
a: 1090.42
(CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD

libxsmm

M N K: 128

libxsmm 2-1.17-3645, GFLOPS/s (More Is Better)
a: 1055.3
b: 946.7
SE +/- 54.55, N = 2
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 3.6, Seconds (Fewer Is Better)
a: 239.55 (SE +/- 0.32, N = 2)
b: 239.03 (SE +/- 1.32, N = 2)
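
When reading the a/b pairs in this result file, it can help to express the gap as a percentage and to set it against the reported standard errors. The following is a minimal sketch, not part of the Phoronix Test Suite: the function name `compare_runs` and the "twice the combined standard error" threshold are assumptions chosen for illustration, and with N = 2 per run the check is only a rough indication. It uses the Blender Barbershop figures (a = 239.55 s, SE 0.32; b = 239.03 s, SE 1.32).

```python
from math import sqrt


def compare_runs(a, b, se_a, se_b, higher_is_better):
    """Compare run b against run a (hypothetical helper, not a PTS API).

    Returns (gain_pct, combined_se, exceeds_noise):
      gain_pct      positive when b is better than a
      combined_se   the two standard errors added in quadrature
      exceeds_noise True when |a - b| is larger than 2 * combined_se
                    (the factor 2 is a rule-of-thumb assumption)
    """
    change_pct = (b - a) / a * 100.0
    # For "Fewer Is Better" metrics a lower b is an improvement, so flip the sign.
    gain_pct = change_pct if higher_is_better else -change_pct
    combined_se = sqrt(se_a ** 2 + se_b ** 2)
    exceeds_noise = abs(b - a) > 2.0 * combined_se
    return gain_pct, combined_se, exceeds_noise


# Blender 3.6, Barbershop, CPU-Only: seconds, fewer is better.
gain, se, significant = compare_runs(239.55, 239.03, 0.32, 1.32,
                                     higher_is_better=False)
print(f"b vs a: {gain:+.2f}% (combined SE {se:.2f}, exceeds noise: {significant})")
```

For this pair the 0.52 s gap is well inside twice the combined standard error, so under the stated assumption the two runs are not distinguishable.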

libxsmm

M N K: 256

libxsmm 2-1.17-3645, GFLOPS/s (More Is Better)
a: 599.8
b: 592.5
SE +/- 2.65, N = 2
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

OSPRay

Benchmark: particle_volume/pathtracer/real_time

OSPRay 2.12, Items Per Second (More Is Better)
a: 151.14
b: 151.27
SE +/- 0.63, N = 2

NCNN

Target: CPU - Model: FastestDet

NCNN 20230517, ms (Fewer Is Better)
a: 9.62 (SE +/- 0.05, N = 2; MIN: 9.35 / MAX: 10.52)
b: 10.01 (SE +/- 0.28, N = 2; MIN: 9.4 / MAX: 59.43)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20230517, ms (Fewer Is Better)
a: 46.50 (SE +/- 2.46, N = 2; MIN: 42.6 / MAX: 72.28)
b: 45.56 (SE +/- 1.19, N = 2; MIN: 43.24 / MAX: 70.59)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20230517, ms (Fewer Is Better)
a: 38.20 (SE +/- 0.86, N = 2; MIN: 36.18 / MAX: 62.76)
b: 39.37 (SE +/- 1.10, N = 2; MIN: 37.07 / MAX: 103.97)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20230517, ms (Fewer Is Better)
a: 15.78 (SE +/- 0.07, N = 2; MIN: 15.4 / MAX: 43.08)
b: 16.13 (SE +/- 0.41, N = 2; MIN: 15.35 / MAX: 39.36)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20230517, ms (Fewer Is Better)
a: 24.77 (SE +/- 0.65, N = 2; MIN: 22.68 / MAX: 208.18)
b: 24.48 (SE +/- 0.64, N = 2; MIN: 22.66 / MAX: 47.54)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20230517, ms (Fewer Is Better)
a: 17.57 (SE +/- 0.36, N = 2; MIN: 16.92 / MAX: 18.88)
b: 18.15 (SE +/- 0.83, N = 2; MIN: 16.98 / MAX: 42.81)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20230517, ms (Fewer Is Better)
a: 5.65 (SE +/- 0.43, N = 2; MIN: 5.03 / MAX: 6.71)
b: 5.44 (SE +/- 0.22, N = 2; MIN: 5.08 / MAX: 7.52)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20230517, ms (Fewer Is Better)
a: 10.31 (SE +/- 1.04, N = 2; MIN: 9.03 / MAX: 33.3)
b: 9.63 (SE +/- 0.31, N = 2; MIN: 9.16 / MAX: 26.03)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20230517, ms (Fewer Is Better)
a: 26.27 (SE +/- 0.84, N = 2; MIN: 24.05 / MAX: 301.35)
b: 26.19 (SE +/- 0.34, N = 2; MIN: 24.19 / MAX: 341.6)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20230517, ms (Fewer Is Better)
a: 17.06 (SE +/- 1.05, N = 2; MIN: 15.5 / MAX: 66.12)
b: 16.72 (SE +/- 0.37, N = 2; MIN: 15.67 / MAX: 100.92)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20230517, ms (Fewer Is Better)
a: 4.49 (SE +/- 0.09, N = 2; MIN: 4.31 / MAX: 5.13)
b: 4.61 (SE +/- 0.01, N = 2; MIN: 4.49 / MAX: 5.33)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20230517, ms (Fewer Is Better)
a: 11.48 (SE +/- 0.23, N = 2; MIN: 10.9 / MAX: 56.34)
b: 11.64 (SE +/- 0.26, N = 2; MIN: 10.85 / MAX: 37.54)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20230517, ms (Fewer Is Better)
a: 7.61 (SE +/- 0.07, N = 2; MIN: 7.33 / MAX: 43.29)
b: 7.43 (SE +/- 0.01, N = 2; MIN: 7.16 / MAX: 15.37)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20230517, ms (Fewer Is Better)
a: 9.82 (SE +/- 0.03, N = 2; MIN: 9.6 / MAX: 12.61)
b: 9.75 (SE +/- 0.06, N = 2; MIN: 9.56 / MAX: 13.68)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517, ms (Fewer Is Better)
a: 8.71 (SE +/- 0.14, N = 2; MIN: 8.43 / MAX: 9.8)
b: 8.77 (SE +/- 0.03, N = 2; MIN: 8.59 / MAX: 32.78)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517, ms (Fewer Is Better)
a: 7.91 (SE +/- 0.13, N = 2; MIN: 7.68 / MAX: 9.6)
b: 7.94 (SE +/- 0.02, N = 2; MIN: 7.81 / MAX: 10.99)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20230517, ms (Fewer Is Better)
a: 16.06 (SE +/- 0.82, N = 2; MIN: 14.92 / MAX: 25.43)
b: 15.90 (SE +/- 0.28, N = 2; MIN: 15.2 / MAX: 39.56)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

VVenC 1.9, Frames Per Second (More Is Better)
a: 5.672 (SE +/- 0.019, N = 2)
b: 5.717 (SE +/- 0.063, N = 2)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

OSPRay

Benchmark: particle_volume/scivis/real_time

OSPRay 2.12, Items Per Second (More Is Better)
a: 24.36
b: 24.78
SE +/- 0.03, N = 2

Z3 Theorem Prover

SMT File: 2.smt2

Z3 Theorem Prover 4.12.1, Seconds (Fewer Is Better)
a: 88.00 (SE +/- 0.02, N = 2)
b: 87.18 (SE +/- 0.05, N = 2)
(CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC

Apache CouchDB

Bulk Size: 300 - Inserts: 1000 - Rounds: 30

Apache CouchDB 3.3.2, Seconds (Fewer Is Better)
a: 152.46
(CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD

OSPRay

Benchmark: particle_volume/ao/real_time

OSPRay 2.12, Items Per Second (More Is Better)
a: 24.64
b: 24.62
SE +/- 0.09, N = 2

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 3.6, Seconds (Fewer Is Better)
a: 62.35 (SE +/- 0.04, N = 2)
b: 62.51 (SE +/- 0.19, N = 2)

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

VVenC 1.9, Frames Per Second (More Is Better)
a: 10.28 (SE +/- 0.03, N = 2)
b: 10.43 (SE +/- 0.10, N = 2)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

srsRAN Project

Test: PUSCH Processor Benchmark, Throughput Total

srsRAN Project 23.5, Mbps (More Is Better)
a: 9800.5 (SE +/- 47.35, N = 2)
b: 9756.7 (SE +/- 33.95, N = 2)
(CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

Apache CouchDB

Bulk Size: 100 - Inserts: 1000 - Rounds: 30

Apache CouchDB 3.3.2, Seconds (Fewer Is Better)
a: 94.83
(CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD

GPAW

Input: Carbon Nanotube

GPAW 23.6, Seconds (Fewer Is Better)
a: 45.82 (SE +/- 0.02, N = 2)
b: 45.64 (SE +/- 0.03, N = 2)
(CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

VVenC

Video Input: Bosphorus 1080p - Video Preset: Fast

VVenC 1.9, Frames Per Second (More Is Better)
a: 15.71 (SE +/- 0.06, N = 2)
b: 15.72 (SE +/- 0.04, N = 2)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

OSPRay

Benchmark: gravity_spheres_volume/dim_512/scivis/real_time

OSPRay 2.12, Items Per Second (More Is Better)
a: 20.81
b: 20.48
SE +/- 0.05, N = 2

OSPRay

Benchmark: gravity_spheres_volume/dim_512/ao/real_time

OSPRay 2.12, Items Per Second (More Is Better)
a: 21.21
b: 20.95
SE +/- 0.22, N = 2

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3, GFLOP/s (More Is Better)
a: 47.28 (SE +/- 0.14, N = 2)
b: 47.39 (SE +/- 0.07, N = 2)
(CXX) g++ options: -O3

Apache IoTDB

Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500

Apache IoTDB 1.1.2, Average Latency (Fewer Is Better)
a: 33.38 (MAX: 934.86)
b: 32.75 (MAX: 992.49)

Apache IoTDB

Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500

Apache IoTDB 1.1.2, point/sec (More Is Better)
a: 1343156.56
b: 1372429.58

OSPRay

Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time

OSPRay 2.12, Items Per Second (More Is Better)
a: 22.70
b: 22.58
SE +/- 0.00, N = 2

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3, GFLOP/s (More Is Better)
a: 49.44 (SE +/- 0.29, N = 2)
b: 48.72 (SE +/- 0.39, N = 2)
(CXX) g++ options: -O3

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 3.6, Seconds (Fewer Is Better)
a: 30.59 (SE +/- 0.01, N = 2)
b: 30.77 (SE +/- 0.02, N = 2)

QuantLib

QuantLib 1.30, MFLOPS (More Is Better)
a: 2622.9 (SE +/- 2.00, N = 2)
b: 2607.6 (SE +/- 1.80, N = 2)
(CXX) g++ options: -O3 -march=native -fPIE -pie

Stress-NG

Test: Cloning

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 16195.03 (SE +/- 3270.70, N = 2)
b: 13172.81 (SE +/- 654.66, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Pthread

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 92131.54 (SE +/- 279.15, N = 2)
b: 90361.70 (SE +/- 894.90, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Zlib

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 6879.86 (SE +/- 8.83, N = 2)
b: 6880.22 (SE +/- 4.39, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Vector Floating Point

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 132479.08 (SE +/- 872.22, N = 2)
b: 131100.25 (SE +/- 235.00, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Fused Multiply-Add

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 181083180.47 (SE +/- 118686.48, N = 2)
b: 181314757.42 (SE +/- 92010.25, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Matrix 3D Math

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 12743.81 (SE +/- 9.80, N = 2)
b: 12742.70 (SE +/- 5.06, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Pipe

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 40500166.81 (SE +/- 2523742.74, N = 2)
b: 49325396.97 (SE +/- 6369572.71, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Wide Vector Math

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 2195391.41 (SE +/- 1200.16, N = 2)
b: 2196242.21 (SE +/- 497.69, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Floating Point

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 21134.81 (SE +/- 6.86, N = 2)
b: 21133.02 (SE +/- 9.14, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: AVL Tree

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 610.69 (SE +/- 0.08, N = 2)
b: 610.83 (SE +/- 0.44, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Vector Shuffle

Stress-NG 0.15.10, Bogo Ops/s (More Is Better)
a: 48054.48 (SE +/- 1.26, N = 2)
b: 48076.78 (SE +/- 42.22, N = 2)
(CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6, samples/s (More Is Better)
a: 201615000 (SE +/- 795000.00, N = 2)
b: 198790000 (SE +/- 650000.00, N = 2)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6, samples/s (More Is Better)
a: 615105000 (SE +/- 3155000.00, N = 2)
b: 623535000 (SE +/- 11305000.00, N = 2)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6, samples/s (More Is Better)
a: 493660000 (SE +/- 1500000.00, N = 2)
b: 498410000 (SE +/- 830000.00, N = 2)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6, samples/s (More Is Better)
a: 13323000 (SE +/- 1000.00, N = 2)
b: 13291000 (SE +/- 34000.00, N = 2)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6, samples/s (More Is Better)
a: 53918500 (SE +/- 500.00, N = 2)
b: 53926500 (SE +/- 1500.00, N = 2)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6, samples/s (More Is Better)
a: 32338000 (SE +/- 0.00, N = 2)
b: 32267000 (SE +/- 0.00, N = 2)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500

Apache IoTDB 1.1.2, Average Latency (Fewer Is Better)
a: 109.40 (MAX: 2142.92)
b: 96.18 (MAX: 1249.92)

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500

Apache IoTDB 1.1.2, point/sec (More Is Better)
a: 39562245.22
b: 43021501.40

Opus Codec Encoding

WAV To Opus Encode

Opus Codec Encoding 1.4, Seconds (Fewer Is Better)
a: 36.74
b: 36.73
SE +/- 0.01, N = 2
(CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

Z3 Theorem Prover

SMT File: 1.smt2

Z3 Theorem Prover 4.12.1, Seconds (Fewer Is Better)
a: 25.71 (SE +/- 0.03, N = 2)
b: 25.25 (SE +/- 0.07, N = 2)
(CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500

Apache IoTDB 1.1.2, Average Latency (Fewer Is Better)
a: 36.74 (MAX: 793.88)
b: 36.82 (MAX: 691.5)

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500

Apache IoTDB 1.1.2, point/sec (More Is Better)
a: 1134736.54
b: 1137612.61

srsRAN Project

Test: Downlink Processor Benchmark

srsRAN Project 23.5, Mbps (More Is Better)
a: 556.5 (SE +/- 0.70, N = 2)
b: 556.8 (SE +/- 1.25, N = 2)
(CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

Apache IoTDB

Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200

Apache IoTDB 1.1.2, Average Latency (Fewer Is Better)
a: 13.25 (MAX: 896.77)
b: 14.12 (MAX: 878.17)

Apache IoTDB

Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200

Apache IoTDB 1.1.2, point/sec (More Is Better)
a: 1199743.22
b: 1141859.25

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 3.6, Seconds (Fewer Is Better)
a: 23.69 (SE +/- 0.09, N = 2)
b: 23.62 (SE +/- 0.06, N = 2)

VVenC

Video Input: Bosphorus 1080p - Video Preset: Faster

VVenC 1.9, Frames Per Second (More Is Better)
a: 29.08 (SE +/- 0.37, N = 2)
b: 29.18 (SE +/- 0.23, N = 2)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Liquid-DSP

Threads: 160 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6, samples/s (More Is Better)
a: 1013200000
b: 1011200000
SE +/- 1800000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6, samples/s (More Is Better)
a: 949400000
b: 945190000
SE +/- 2180000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6, samples/s (More Is Better)
a: 725840000
b: 730310000
SE +/- 1250000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6, samples/s (More Is Better)
a: 400730000
b: 396265000
SE +/- 2775000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 160 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6, samples/s (More Is Better)
a: 2602300000
b: 2636450000
SE +/- 17250000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6, samples/s (More Is Better)
a: 2519200000
b: 2426350000
SE +/- 13350000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 160 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6, samples/s (More Is Better)
a: 3390700000
b: 3381950000
SE +/- 9150000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6, samples/s (More Is Better)
a: 2961100000
b: 2945150000
SE +/- 6350000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6, samples/s (More Is Better)
a: 2069200000
b: 2076650000
SE +/- 13650000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6, samples/s (More Is Better)
a: 1805000000
b: 1825450000
SE +/- 2750000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6, samples/s (More Is Better)
a: 1197700000
b: 1185500000
SE +/- 20600000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6, samples/s (More Is Better)
a: 992540000
b: 993445000
SE +/- 2085000.00, N = 2
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3, GFLOP/s (More Is Better)
a: 90.57 (SE +/- 1.08, N = 2)
b: 90.24 (SE +/- 1.17, N = 2)
(CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3, GFLOP/s (More Is Better)
a: 93.33 (SE +/- 0.40, N = 2)
b: 92.26 (SE +/- 0.24, N = 2)
(CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3, GFLOP/s (More Is Better)
a: 94.26 (SE +/- 0.13, N = 2)
b: 92.72 (SE +/- 0.73, N = 2)
(CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3, GFLOP/s (More Is Better)
a: 94.83 (SE +/- 1.11, N = 2)
b: 94.34 (SE +/- 0.95, N = 2)
(CXX) g++ options: -O3

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200

Apache IoTDB 1.1.2, Average Latency (Fewer Is Better)
a: 42.79 (MAX: 855.16)
b: 42.19 (MAX: 784.56)

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200

Apache IoTDB 1.1.2, point/sec (More Is Better)
a: 34266143.85
b: 34807016.85

Embree

Binary: Pathtracer - Model: Asian Dragon Obj

Embree 4.1 (Frames Per Second, More Is Better): a = 76.96, b = 77.26. SE +/- 0.03, N = 2; SE +/- 0.03, N = 2. MIN: 75.53 / MAX: 82.14; MIN: 75.78 / MAX: 81.08.

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500

Apache IoTDB 1.1.2 (Average Latency, Fewer Is Better): a = 35.99, b = 36.16. MAX: 724.8; MAX: 769.5.

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500

Apache IoTDB 1.1.2 (point/sec, More Is Better): a = 995259.68, b = 992909.69.

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon Obj

Embree 4.1 (Frames Per Second, More Is Better): a = 89.84, b = 90.01. SE +/- 0.06, N = 2; SE +/- 0.05, N = 2. MIN: 87.68 / MAX: 94.71; MIN: 87.6 / MAX: 94.43.

srsRAN Project

Test: PUSCH Processor Benchmark, Throughput Thread

srsRAN Project 23.5 (Mbps, More Is Better): a = 164.8, b = 164.7. SE +/- 1.70, N = 2; SE +/- 0.90, N = 2. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200

Apache IoTDB 1.1.2 (Average Latency, Fewer Is Better): a = 14.84, b = 14.42. MAX: 605.55; MAX: 596.84.

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200

Apache IoTDB 1.1.2 (point/sec, More Is Better): a = 904320.60, b = 920435.77.

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200

Apache IoTDB 1.1.2 (Average Latency, Fewer Is Better): a = 17.54, b = 18.04. MAX: 680.16; MAX: 597.99.

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200

Apache IoTDB 1.1.2 (point/sec, More Is Better): a = 638644.35, b = 628202.55.

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 1.2.1 (FPS, More Is Better): a = 476.82, b = 476.77. SE +/- 0.41, N = 2. (CC) gcc options: -pthread -lm

dav1d

Video Input: Chimera 1080p

dav1d 1.2.1 (FPS, More Is Better): a = 516.17, b = 516.50. SE +/- 0.06, N = 2. (CC) gcc options: -pthread -lm

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 170.91, b = 171.13. SE +/- 1.25, N = 2; SE +/- 1.39, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 512

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 176.63, b = 174.11. SE +/- 0.68, N = 2; SE +/- 0.12, N = 2. (CXX) g++ options: -O3

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only

Intel Open Image Denoise 2.0 (Images / Sec, More Is Better): a = 1.47, b = 1.46. SE +/- 0.00, N = 2.

dav1d

Video Input: Summer Nature 4K

dav1d 1.2.1 (FPS, More Is Better): a = 282.53, b = 282.65. SE +/- 0.08, N = 2. (CC) gcc options: -pthread -lm

Remhos

Test: Sample Remap Example

Remhos 1.0 (Seconds, Fewer Is Better): a = 12.20, b = 12.37. SE +/- 0.10, N = 2. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Embree

Binary: Pathtracer - Model: Asian Dragon

Embree 4.1 (Frames Per Second, More Is Better): a = 85.24, b = 85.13. SE +/- 0.04, N = 2; SE +/- 0.14, N = 2. MIN: 83.75 / MAX: 89.99; MIN: 83.65 / MAX: 90.45.

libxsmm

M N K: 64

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better): a = 1219.9, b = 1216.0. SE +/- 1.25, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

libxsmm

M N K: 32

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better): a = 633.2, b = 639.3. SE +/- 2.35, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Embree

Binary: Pathtracer - Model: Crown

Embree 4.1 (Frames Per Second, More Is Better): a = 72.04, b = 70.79. SE +/- 0.12, N = 2. MIN: 68.2 / MAX: 79.55; MIN: 67 / MAX: 79.71.

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

Embree 4.1 (Frames Per Second, More Is Better): a = 104.41, b = 104.55. SE +/- 0.32, N = 2; SE +/- 0.24, N = 2. MIN: 101.88 / MAX: 109.22; MIN: 102.2 / MAX: 108.91.

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 4.1 (Frames Per Second, More Is Better): a = 87.93, b = 87.93. SE +/- 0.10, N = 2. MIN: 85.27 / MAX: 92.58; MIN: 84.73 / MAX: 92.37.

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise 2.0 (Images / Sec, More Is Better): a = 3.04, b = 3.01. SE +/- 0.00, N = 2.

Intel Open Image Denoise

Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise 2.0 (Images / Sec, More Is Better): a = 3.05, b = 3.03. SE +/- 0.01, N = 2.

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 45.85, b = 46.46. SE +/- 0.55, N = 2. (CXX) g++ options: -O3

dav1d

Video Input: Summer Nature 1080p

dav1d 1.2.1 (FPS, More Is Better): a = 699.97, b = 699.09. SE +/- 0.50, N = 2. (CC) gcc options: -pthread -lm

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 46.66, b = 46.51. SE +/- 0.08, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 93.01, b = 92.82. SE +/- 1.52, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 102.28, b = 98.83. SE +/- 0.00, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 101.94, b = 102.43. SE +/- 0.72, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 101.80, b = 104.35. SE +/- 0.34, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 222.22, b = 230.54. SE +/- 3.31, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 236.67, b = 236.12. SE +/- 2.75, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 69.48, b = 67.95. SE +/- 1.01, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 94.45, b = 91.90. SE +/- 1.46, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 107.45, b = 107.95. SE +/- 1.54, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 117.01, b = 116.84. SE +/- 4.04, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 156.22, b = 148.18. SE +/- 2.45, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 159.34, b = 158.71. SE +/- 0.21, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 185.45, b = 182.03. SE +/- 0.86, N = 2. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better): a = 199.10, b = 199.33. SE +/- 0.39, N = 2. (CXX) g++ options: -O3


Phoronix Test Suite v10.8.5