AMD EPYC 9755 DDR5 Turin Memory Performance

AMD EPYC 9755 memory performance at the default DDR5-6000 versus DDR5-4800. Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2410130-NE-TURINDDR565&gru&sor.

AMD EPYC 9755 DDR5 Turin Memory Performance: system details (the DDR5-4800 and DDR5-6000 configurations differ only in memory)

Processor: AMD EPYC 9755 128-Core @ 2.70GHz (128 Cores / 256 Threads)
Motherboard: AMD VOLCANO (RVOT1000D BIOS)
Chipset: AMD Device 153a
Memory (DDR5-4800): 12 x 64GB DDR5-4800MT/s Samsung M321R8GA0PB1-CCPKC
Memory (DDR5-6000): 12 x 64GB DDR5-6000MT/s Samsung M321R8GA0PB1-CCPKC
Disk: 2 x 1920GB KIOXIA KCD8XPUG1T92
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.04
Kernel: 6.10.0-phx (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002110

Java Details: OpenJDK Runtime Environment (build 21.0.3-ea+7-Ubuntu-1build1)

Python Details: Python 3.12.2

Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
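
For context on the two memory configurations, the theoretical peak bandwidth of each can be estimated with a short calculation. This is a rough sketch, not a measured figure: it assumes the 12 x 64GB DIMMs populate 12 memory channels (one DIMM per channel) and that each channel moves 8 bytes per transfer. Neither assumption is stated in the result file, and the function name `peak_bandwidth_gbs` is illustrative.

```python
# Assumptions (not stated in the source): 12 memory channels, one DIMM per
# channel, and a 64-bit (8-byte) data path per channel.
CHANNELS = 12
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(mega_transfers_per_sec: int, channels: int = CHANNELS) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_sec = mega_transfers_per_sec * 1e6 * BYTES_PER_TRANSFER * channels
    return bytes_per_sec / 1e9


if __name__ == "__main__":
    slow = peak_bandwidth_gbs(4800)
    fast = peak_bandwidth_gbs(6000)
    print(f"DDR5-4800: {slow:.1f} GB/s")
    print(f"DDR5-6000: {fast:.1f} GB/s")
    print(f"Ratio: {fast / slow:.2f}x")
```

Under these assumptions the faster memory has a 1.25x theoretical bandwidth advantage, which is an upper bound on the gain a purely bandwidth-limited test could show.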

AMD EPYC 9755 DDR5 Turin Memory Performance: results overview (one row per test; columns: DDR5-4800, DDR5-6000). Individual test results follow.

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-4800: 51.61 (SE +/- 0.59, N = 3; MIN: 48.6 / MAX: 53.42)
DDR5-6000: 51.20 (SE +/- 0.42, N = 3; MIN: 48.71 / MAX: 53.18)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-4800: 20.47 (SE +/- 0.12, N = 3; MIN: 19.75 / MAX: 20.95)
DDR5-6000: 20.24 (SE +/- 0.07, N = 3; MIN: 19.69 / MAX: 20.76)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-6000: 43.68 (SE +/- 0.27, N = 3; MIN: 41.7 / MAX: 44.98)
DDR5-4800: 43.07 (SE +/- 0.05, N = 3; MIN: 41.33 / MAX: 43.96)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-4800: 17.77 (SE +/- 0.16, N = 7; MIN: 16.6 / MAX: 18.47)
DDR5-6000: 17.46 (SE +/- 0.12, N = 3; MIN: 16.79 / MAX: 17.99)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-6000: 43.61 (SE +/- 0.29, N = 3; MIN: 41.9 / MAX: 45.08)
DDR5-4800: 43.24 (SE +/- 0.36, N = 3; MIN: 41.12 / MAX: 44.61)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-4800: 17.58 (SE +/- 0.14, N = 3; MIN: 16.88 / MAX: 18.01)
DDR5-6000: 17.42 (SE +/- 0.19, N = 4; MIN: 16.53 / MAX: 18.16)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-6000: 43.83 (SE +/- 0.13, N = 3; MIN: 41.6 / MAX: 45.11)
DDR5-4800: 42.84 (SE +/- 0.03, N = 3; MIN: 41.63 / MAX: 44.31)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
DDR5-4800: 17.53 (SE +/- 0.13, N = 3; MIN: 16.64 / MAX: 17.95)
DDR5-6000: 17.49 (SE +/- 0.05, N = 3; MIN: 16.94 / MAX: 18.01)
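
Several of the PyTorch gaps above are smaller than the reported standard errors, so it can help to express each gap in units of the combined standard error. The following is a minimal sketch of that check using means and SE values copied from the PyTorch results above. The helper name `difference_in_se_units` and the threshold of 2 are assumptions chosen for illustration (a common rule of thumb), not something the result file defines, and with N = 3 runs the outcome should be read as a rough indication only.

```python
import math


def difference_in_se_units(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Absolute difference of two means divided by their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# (DDR5-4800 mean, SE, DDR5-6000 mean, SE), copied from the PyTorch results above.
PYTORCH_RESULTS = {
    "Batch Size 1, ResNet-50": (51.61, 0.59, 51.20, 0.42),
    "Batch Size 512, ResNet-50": (42.84, 0.03, 43.83, 0.13),
    "Batch Size 512, ResNet-152": (17.53, 0.13, 17.49, 0.05),
}

# Assumed rule of thumb, not part of the source: a gap below about
# 2 combined standard errors is treated as run-to-run noise.
NOISE_THRESHOLD = 2.0

if __name__ == "__main__":
    for name, (m48, se48, m60, se60) in PYTORCH_RESULTS.items():
        ratio = difference_in_se_units(m48, se48, m60, se60)
        verdict = "likely noise" if ratio < NOISE_THRESHOLD else "likely a real difference"
        print(f"{name}: {ratio:.1f} combined SEs apart ({verdict})")
```

Dividing by the combined standard error rather than comparing raw values keeps tests with very different units and scales on the same footing.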

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
DDR5-6000: 385.90 (SE +/- 2.67, N = 11)
DDR5-4800: 383.90 (SE +/- 3.39, N = 15)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
DDR5-4800: 386.20 (SE +/- 3.02, N = 3)
DDR5-6000: 386.18 (SE +/- 3.07, N = 4)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

OpenSSL

Algorithm: SHA256

OpenSSL 3.3 (byte/s, More Is Better)
DDR5-4800: 186789819953 (SE +/- 316551135.30, N = 3)
DDR5-6000: 186660393313 (SE +/- 76407988.26, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: SHA512

OpenSSL 3.3 (byte/s, More Is Better)
DDR5-4800: 72936235753 (SE +/- 495890924.48, N = 3)
DDR5-6000: 72359767240 (SE +/- 163780192.60, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: AES-128-GCM

OpenSSL 3.3 (byte/s, More Is Better)
DDR5-6000: 2007365351470 (SE +/- 679943093.73, N = 3)
DDR5-4800: 2003313098753 (SE +/- 798384586.01, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: AES-256-GCM

OpenSSL 3.3 (byte/s, More Is Better)
DDR5-4800: 1839699436523 (SE +/- 686739707.21, N = 3)
DDR5-6000: 1837765850747 (SE +/- 2482404044.78, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: ChaCha20

OpenSSL 3.3 (byte/s, More Is Better)
DDR5-4800: 1187315147937 (SE +/- 227575784.55, N = 3)
DDR5-6000: 1186359294450 (SE +/- 507390352.95, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: ChaCha20-Poly1305

OpenSSL 3.3 (byte/s, More Is Better)
DDR5-4800: 807588425637 (SE +/- 318743927.24, N = 3)
DDR5-6000: 807009841427 (SE +/- 89801965.81, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

miniFE

Problem Size: Small

miniFE 2.2 (CG Mflops, More Is Better)
DDR5-6000: 69057.1 (SE +/- 78.85, N = 6)
DDR5-4800: 59008.4 (SE +/- 20.01, N = 3)
(CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Algebraic Multi-Grid Benchmark

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
DDR5-6000: 3177082000 (SE +/- 6342005.91, N = 3)
DDR5-4800: 2698476000 (SE +/- 2882733.14, N = 3)
(CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

FFmpeg

Encoder: libx265 - Scenario: Upload

FFmpeg 7.0 (FPS, More Is Better)
DDR5-4800: 32.57 (SE +/- 0.02, N = 3)
DDR5-6000: 32.34 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg

Encoder: libx265 - Scenario: Platform

FFmpeg 7.0 (FPS, More Is Better)
DDR5-4800: 67.14 (SE +/- 0.02, N = 3)
DDR5-6000: 66.60 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg

Encoder: libx265 - Scenario: Video On Demand

FFmpeg 7.0 (FPS, More Is Better)
DDR5-4800: 67.05 (SE +/- 0.05, N = 3)
DDR5-6000: 66.51 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-4800: 194.14 (SE +/- 0.21, N = 3)
DDR5-6000: 193.80 (SE +/- 0.32, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-4800: 192115.85 (SE +/- 417.74, N = 3)
DDR5-6000: 190777.83 (SE +/- 1255.01, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 733.80 (SE +/- 4.77, N = 15)
DDR5-4800: 674.46 (SE +/- 1.38, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-4800: 18679.63 (SE +/- 7.25, N = 3)
DDR5-6000: 18679.36 (SE +/- 13.14, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 11232.26 (SE +/- 2.97, N = 3)
DDR5-4800: 11196.67 (SE +/- 5.50, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 8730.88 (SE +/- 8.58, N = 3)
DDR5-4800: 8681.62 (SE +/- 7.28, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 1109.75 (SE +/- 2.99, N = 3)
DDR5-4800: 1043.31 (SE +/- 6.77, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 31655.57 (SE +/- 0.83, N = 3)
DDR5-4800: 31630.96 (SE +/- 17.88, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-4800: 4776.66 (SE +/- 1.24, N = 3)
DDR5-6000: 4774.33 (SE +/- 5.09, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 3287.43 (SE +/- 2.87, N = 3)
DDR5-4800: 3224.72 (SE +/- 2.32, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 13949.89 (SE +/- 14.83, N = 3)
DDR5-4800: 13918.74 (SE +/- 2.75, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO 2024.0 (FPS, More Is Better)
DDR5-6000: 10557.16 (SE +/- 33.49, N = 3)
DDR5-4800: 9833.47 (SE +/- 39.39, N = 3)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

Embree 4.3 (Frames Per Second, More Is Better)
DDR5-6000: 223.21 (SE +/- 0.06, N = 8; MIN: 219.93 / MAX: 227.73)
DDR5-4800: 220.00 (SE +/- 0.10, N = 3; MIN: 216.78 / MAX: 224.82)

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon Obj

Embree 4.3 (Frames Per Second, More Is Better)
DDR5-6000: 191.78 (SE +/- 0.14, N = 5; MIN: 188.36 / MAX: 196.2)
DDR5-4800: 189.00 (SE +/- 0.10, N = 3; MIN: 186.02 / MAX: 192.57)

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 4.3 (Frames Per Second, More Is Better)
DDR5-6000: 179.45 (SE +/- 0.11, N = 8; MIN: 175.01 / MAX: 186.64)
DDR5-4800: 178.52 (SE +/- 0.40, N = 3; MIN: 174.12 / MAX: 184.84)

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

SVT-AV1 2.0 (Frames Per Second, More Is Better)
DDR5-6000: 271.92 (SE +/- 7.58, N = 15)
DDR5-4800: 265.85 (SE +/- 9.35, N = 12)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 4K

SVT-AV1 2.0 (Frames Per Second, More Is Better)
DDR5-6000: 281.59 (SE +/- 2.17, N = 9)
DDR5-4800: 274.07 (SE +/- 2.96, N = 5)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 2.0 (Frames Per Second, More Is Better)
DDR5-6000: 118.16 (SE +/- 1.10, N = 7)
DDR5-4800: 117.85 (SE +/- 1.05, N = 15)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

SVT-AV1 2.0 (Frames Per Second, More Is Better)
DDR5-6000: 11.28 (SE +/- 0.04, N = 4)
DDR5-4800: 11.26 (SE +/- 0.05, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

x265

Video Input: Bosphorus 4K

x265 3.6 (Frames Per Second, More Is Better)
DDR5-6000: 34.21 (SE +/- 0.11, N = 3)
DDR5-4800: 34.08 (SE +/- 0.08, N = 3)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Slow

Kvazaar 2.2 (Frames Per Second, More Is Better)
DDR5-4800: 53.43 (SE +/- 0.09, N = 3)
DDR5-6000: 53.42 (SE +/- 0.06, N = 5)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.2 (Frames Per Second, More Is Better)
DDR5-4800: 53.99 (SE +/- 0.01, N = 3)
DDR5-6000: 53.95 (SE +/- 0.01, N = 5)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.2 (Frames Per Second, More Is Better)
DDR5-6000: 96.81 (SE +/- 0.49, N = 7)
DDR5-4800: 94.71 (SE +/- 1.24, N = 3)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

uvg266

Video Input: Bosphorus 4K - Video Preset: Slow

uvg266 0.4.1 (Frames Per Second, More Is Better)
DDR5-4800: 39.25 (SE +/- 0.08, N = 3)
DDR5-6000: 39.14 (SE +/- 0.09, N = 4)

uvg266

Video Input: Bosphorus 4K - Video Preset: Medium

uvg266 0.4.1 (Frames Per Second, More Is Better)
DDR5-4800: 43.96 (SE +/- 0.05, N = 3)
DDR5-6000: 43.85 (SE +/- 0.04, N = 4)

uvg266

Video Input: Bosphorus 4K - Video Preset: Very Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
DDR5-6000: 78.23 (SE +/- 0.14, N = 6)
DDR5-4800: 76.35 (SE +/- 0.27, N = 3)

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

VVenC 1.11 (Frames Per Second, More Is Better)
DDR5-6000: 9.810 (SE +/- 0.028, N = 3)
DDR5-4800: 9.719 (SE +/- 0.040, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

VVenC 1.11 (Frames Per Second, More Is Better)
DDR5-6000: 23.74 (SE +/- 0.02, N = 3)
DDR5-4800: 22.88 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (GFInst/s, More Is Better)
DDR5-6000: 9647.50 (SE +/- 66.65, N = 11)
DDR5-4800: 9597.53 (SE +/- 84.72, N = 15)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (GFInst/s, More Is Better)
DDR5-4800: 9655.08 (SE +/- 75.53, N = 3)
DDR5-6000: 9654.58 (SE +/- 76.86, N = 4)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
DDR5-6000: 56.41 (SE +/- 0.18, N = 8)
DDR5-4800: 55.87 (SE +/- 0.31, N = 3)
(CC) gcc options: -O3 -march=native -fopenmp

High Performance Conjugate Gradient

X Y Z: 144 144 144 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
DDR5-6000: 63.67 (SE +/- 0.01, N = 3)
DDR5-4800: 53.74 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
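
To put results like this one in perspective, the gain from DDR5-6000 can be expressed as a percentage over DDR5-4800 and compared with the 25% higher transfer rate (6000 / 4800 = 1.25). The sketch below does that for three results copied from the sections above; the helper name `percent_gain` and the choice of tests are illustrative, not part of the result file.

```python
def percent_gain(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` exceeds `baseline`."""
    return (candidate / baseline - 1.0) * 100.0


# (DDR5-4800 result, DDR5-6000 result), copied from the results above.
RESULTS = {
    "High Performance Conjugate Gradient (GFLOP/s)": (53.74, 63.67),
    "Algebraic Multi-Grid Benchmark (Figure Of Merit)": (2698476000, 3177082000),
    "miniBUDE BM1 (GFInst/s)": (9597.53, 9647.50),
}

# The memory transfer rate itself rises by 25% from DDR5-4800 to DDR5-6000.
TRANSFER_RATE_GAIN = percent_gain(4800, 6000)

if __name__ == "__main__":
    for name, (ddr5_4800, ddr5_6000) in RESULTS.items():
        gain = percent_gain(ddr5_4800, ddr5_6000)
        share = gain / TRANSFER_RATE_GAIN * 100.0
        print(f"{name}: +{gain:.1f}% ({share:.0f}% of the transfer-rate increase)")
```

Tests whose gain approaches the transfer-rate increase are plausibly limited by memory bandwidth, while tests with a gain near zero are more likely compute-bound.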

Xmrig

Variant: GhostRider - Hash Count: 1M

Xmrig 6.21 (H/s, More Is Better)
DDR5-6000: 19873.0 (SE +/- 912.79, N = 15)
DDR5-4800: 18386.4 (SE +/- 997.93, N = 15)
(CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise 2.2 (Images / Sec, More Is Better)
DDR5-6000: 5.84 (SE +/- 0.01, N = 7)
DDR5-4800: 5.72 (SE +/- 0.01, N = 3)

Intel Open Image Denoise

Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise 2.2 (Images / Sec, More Is Better)
DDR5-6000: 5.85 (SE +/- 0.00, N = 7)
DDR5-4800: 5.73 (SE +/- 0.00, N = 3)

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only

Intel Open Image Denoise 2.2 (Images / Sec, More Is Better)
DDR5-6000: 2.85 (SE +/- 0.00, N = 5)
DDR5-4800: 2.77 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1 (images/sec, More Is Better)
DDR5-6000: 7.11 (SE +/- 0.01, N = 3)
DDR5-4800: 6.96 (SE +/- 0.10, N = 15)

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
DDR5-4800: 26.12 (SE +/- 0.12, N = 3)
DDR5-6000: 25.93 (SE +/- 0.16, N = 7)

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1 (images/sec, More Is Better)
DDR5-4800: 22.94 (SE +/- 0.14, N = 15)
DDR5-6000: 22.69 (SE +/- 0.16, N = 15)

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 137.62 (SE +/- 0.26, N = 3)
DDR5-4800: 134.50 (SE +/- 1.56, N = 4)

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 1086.48 (SE +/- 3.83, N = 6)
DDR5-4800: 1071.51 (SE +/- 11.67, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 361.38 (SE +/- 2.87, N = 3)
DDR5-4800: 355.14 (SE +/- 3.12, N = 8)

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 203.33 (SE +/- 0.65, N = 3)
DDR5-4800: 195.89 (SE +/- 0.77, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 2271.95 (SE +/- 2.78, N = 4)
DDR5-4800: 2223.51 (SE +/- 16.20, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 659.11 (SE +/- 7.40, N = 3)
DDR5-4800: 641.12 (SE +/- 6.10, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 246.24 (SE +/- 0.52, N = 3)
DDR5-4800: 231.71 (SE +/- 0.47, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 2665.92 (SE +/- 2.64, N = 3)
DDR5-4800: 2644.59 (SE +/- 3.62, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1: images/sec, More Is Better
DDR5-6000: 819.05 (SE +/- 3.19, N = 3)
DDR5-4800: 780.73 (SE +/- 3.88, N = 3)

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-4800: 12.09 (SE +/- 0.01, N = 3)
DDR5-6000: 12.07 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-6000: 14.55 (SE +/- 0.02, N = 3)
DDR5-4800: 14.42 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-4800: 3.23367 (SE +/- 0.04063, N = 3)
DDR5-6000: 3.22956 (SE +/- 0.04511, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-4800: 195.17 (SE +/- 0.63, N = 3)
DDR5-6000: 194.26 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-4800: 24.59 (SE +/- 0.04, N = 3)
DDR5-6000: 24.54 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-6000: 194.27 (SE +/- 0.38, N = 3)
DDR5-4800: 192.35 (SE +/- 0.96, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-6000: 44.20 (SE +/- 0.48, N = 15)
DDR5-4800: 43.63 (SE +/- 0.36, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-4800: 41.37 (SE +/- 0.24, N = 3)
DDR5-6000: 41.21 (SE +/- 0.20, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-6000: 969.28 (SE +/- 1.72, N = 3)
DDR5-4800: 957.06 (SE +/- 7.84, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.17: Inferences Per Second, More Is Better
DDR5-6000: 236.47 (SE +/- 1.36, N = 3)
DDR5-4800: 234.14 (SE +/- 1.66, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OpenVKL

Benchmark: vklBenchmarkCPU ISPC

OpenVKL 2.0.0: Items / Sec, More Is Better
DDR5-6000: 3654 (SE +/- 0.88, N = 3; MIN: 293 / MAX: 42376)
DDR5-4800: 3614 (SE +/- 1.45, N = 3; MIN: 293 / MAX: 42496)

OSPRay

Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time

OSPRay 3.1: Items Per Second, More Is Better
DDR5-6000: 55.18 (SE +/- 0.01, N = 3)
DDR5-4800: 54.96 (SE +/- 0.02, N = 3)

OSPRay

Benchmark: particle_volume/ao/real_time

OSPRay 3.1: Items Per Second, More Is Better
DDR5-6000: 54.39 (SE +/- 0.02, N = 3)
DDR5-4800: 54.38 (SE +/- 0.01, N = 3)

OSPRay

Benchmark: particle_volume/scivis/real_time

OSPRay 3.1: Items Per Second, More Is Better
DDR5-4800: 54.57 (SE +/- 0.02, N = 3)
DDR5-6000: 54.40 (SE +/- 0.01, N = 3)

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.43: Iterations Per Minute, More Is Better
DDR5-4800: 314 (SE +/- 0.33, N = 3)
DDR5-6000: 313 (SE +/- 0.58, N = 3)
1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.43: Iterations Per Minute, More Is Better
DDR5-4800: 457 (SE +/- 0.58, N = 3)
DDR5-6000: 455 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.43: Iterations Per Minute, More Is Better
DDR5-4800: 406 (SE +/- 0.67, N = 3)
DDR5-6000: 403 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.43: Iterations Per Minute, More Is Better
DDR5-4800: 865 (SE +/- 1.53, N = 3)
DDR5-6000: 851 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0: Iterations/Sec, More Is Better
DDR5-6000: 6042366.34 (SE +/- 10789.95, N = 3)
DDR5-4800: 6019991.22 (SE +/- 24560.32, N = 3)
1. (CC) gcc options: -O2 -lrt" -lrt

LuxCoreRender

Scene: DLSC - Acceleration: CPU

LuxCoreRender 2.6: M samples/sec, More Is Better
DDR5-4800: 21.20 (SE +/- 0.38, N = 15; MIN: 19.48 / MAX: 27.07)
DDR5-6000: 20.94 (SE +/- 0.33, N = 15; MIN: 19.5 / MAX: 27)

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: CPU

LuxCoreRender 2.6: M samples/sec, More Is Better
DDR5-6000: 13.96 (SE +/- 0.14, N = 15; MIN: 6.55 / MAX: 16.95)
DDR5-4800: 13.93 (SE +/- 0.15, N = 15; MIN: 6.46 / MAX: 16.83)

LuxCoreRender

Scene: Orange Juice - Acceleration: CPU

LuxCoreRender 2.6: M samples/sec, More Is Better
DDR5-6000: 32.02 (SE +/- 0.52, N = 15; MIN: 26.48 / MAX: 43.04)
DDR5-4800: 30.83 (SE +/- 0.41, N = 15; MIN: 26.09 / MAX: 42.8)

SecureMark

Benchmark: SecureMark-TLS

SecureMark 1.0.4: marks, More Is Better
DDR5-4800: 407759 (SE +/- 2301.56, N = 3)
DDR5-6000: 405454 (SE +/- 809.78, N = 3)
1. (CC) gcc options: -pedantic -O3

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.4: MB/s, More Is Better
DDR5-6000: 21.8 (SE +/- 0.12, N = 3)
DDR5-4800: 21.4 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.4: MB/s, More Is Better
DDR5-6000: 1726.4 (SE +/- 8.42, N = 3)
DDR5-4800: 1721.5 (SE +/- 6.51, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.4: MB/s, More Is Better
DDR5-6000: 10.2 (SE +/- 0.00, N = 3)
DDR5-4800: 10.2 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.4: MB/s, More Is Better
DDR5-6000: 1636.4 (SE +/- 1.50, N = 3)
DDR5-4800: 1627.7 (SE +/- 4.80, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

srsRAN Project

Test: PUSCH Processor Benchmark, Throughput Total

srsRAN Project 23.10.1-20240325: Mbps, More Is Better
DDR5-4800: 8056.3 (SE +/- 89.89, N = 15; MIN: 5096.4 / MAX: 8588.6)
DDR5-6000: 7954.2 (SE +/- 0.32, N = 3; MIN: 5432.6 / MAX: 7954.6)
1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl

srsRAN Project

Test: PUSCH Processor Benchmark, Throughput Thread

srsRAN Project 23.10.1-20240325: Mbps, More Is Better
DDR5-6000: 183.3 (SE +/- 0.00, N = 4; MIN: 105.8)
DDR5-4800: 183.3 (SE +/- 0.00, N = 3; MIN: 105.8)
1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl

srsRAN Project

Test: PDSCH Processor Benchmark, Throughput Total

srsRAN Project 23.10.1-20240325: Mbps, More Is Better
DDR5-4800: 19592.6 (SE +/- 180.03, N = 3)
DDR5-6000: 19454.1 (SE +/- 214.06, N = 3)
1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl

srsRAN Project

Test: PDSCH Processor Benchmark, Throughput Thread

srsRAN Project 23.10.1-20240325: Mbps, More Is Better
DDR5-6000: 994.0 (SE +/- 2.19, N = 9)
DDR5-4800: 990.4 (SE +/- 10.21, N = 5)
1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl

QuantLib

Configuration: Single-Threaded

QuantLib 1.32: MFLOPS, More Is Better
DDR5-6000: 4264.7 (SE +/- 40.37, N = 3)
DDR5-4800: 4252.2 (SE +/- 45.37, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

QuantLib

Configuration: Multi-Threaded

QuantLib 1.32: MFLOPS, More Is Better
DDR5-6000: 465436.8 (SE +/- 401.79, N = 3)
DDR5-4800: 464193.2 (SE +/- 649.08, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

ASKAP

Test: tConvolve MT - Gridding

ASKAP 1.0: Million Grid Points Per Second, More Is Better
DDR5-6000: 16464.1 (SE +/- 0.00, N = 3)
DDR5-4800: 14143.3 (SE +/- 9.34, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MT - Degridding

ASKAP 1.0: Million Grid Points Per Second, More Is Better
DDR5-6000: 27297.4 (SE +/- 28.93, N = 3)
DDR5-4800: 22665.1 (SE +/- 10.95, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

7-Zip Compression

Test: Compression Rating

7-Zip Compression 22.01: MIPS, More Is Better
DDR5-6000: 908159 (SE +/- 3530.77, N = 3)
DDR5-4800: 882936 (SE +/- 2814.63, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 22.01: MIPS, More Is Better
DDR5-6000: 846736 (SE +/- 276.36, N = 3)
DDR5-4800: 845631 (SE +/- 719.43, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.2.4: MP/s, More Is Better
DDR5-6000: 3.90 (SE +/- 0.01, N = 7)
DDR5-4800: 3.88 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.2.4: MP/s, More Is Better
DDR5-6000: 0.68 (SE +/- 0.00, N = 3)
DDR5-4800: 0.67 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -lm

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0: Mpix/sec, More Is Better
DDR5-6000: 70762.8 (SE +/- 395.30, N = 3)
DDR5-4800: 66651.4 (SE +/- 610.89, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0: Mpix/sec, More Is Better
DDR5-6000: 74093.3 (SE +/- 438.43, N = 3)
DDR5-4800: 68467.4 (SE +/- 752.40, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASTC Encoder

Preset: Medium

ASTC Encoder 4.7: MT/s, More Is Better
DDR5-6000: 701.73 (SE +/- 3.94, N = 8)
DDR5-4800: 697.19 (SE +/- 5.72, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 4.7: MT/s, More Is Better
DDR5-6000: 110.04 (SE +/- 0.02, N = 6)
DDR5-4800: 110.03 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Very Thorough

ASTC Encoder 4.7: MT/s, More Is Better
DDR5-6000: 15.72 (SE +/- 0.00, N = 3)
DDR5-4800: 15.71 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 4.7: MT/s, More Is Better
DDR5-6000: 9.6576 (SE +/- 0.0004, N = 3)
DDR5-4800: 9.6576 (SE +/- 0.0032, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

Stockfish

Chess Benchmark

Stockfish 16.1: Nodes Per Second, More Is Better
DDR5-4800: 308692581 (SE +/- 1906957.54, N = 3)
DDR5-6000: 307343923 (SE +/- 4338134.07, N = 15)
1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2024: Ns Per Day, More Is Better
DDR5-6000: 22.73 (SE +/- 0.04, N = 3)
DDR5-4800: 22.51 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -lm

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 23Jun2022: ns/day, More Is Better
DDR5-6000: 55.79 (SE +/- 0.31, N = 10)
DDR5-4800: 55.15 (SE +/- 0.61, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 23Jun2022: ns/day, More Is Better
DDR5-4800: 68.48 (SE +/- 0.58, N = 3)
DDR5-6000: 67.78 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

NAMD

Input: ATPase with 327,506 Atoms

NAMD 3.0b6: ns/day, More Is Better
DDR5-6000: 14.24 (SE +/- 0.03, N = 7)
DDR5-4800: 14.04 (SE +/- 0.02, N = 3)

NAMD

Input: STMV with 1,066,628 Atoms

NAMD 3.0b6: ns/day, More Is Better
DDR5-6000: 4.62277 (SE +/- 0.00697, N = 3)
DDR5-4800: 4.54707 (SE +/- 0.01366, N = 3)

RocksDB

Test: Random Read

RocksDB 9.0: Op/s, More Is Better
DDR5-4800: 793390385 (SE +/- 225348.54, N = 3)
DDR5-6000: 786896580 (SE +/- 5260460.78, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

Test: Read While Writing

RocksDB 9.0: Op/s, More Is Better
DDR5-6000: 17063021 (SE +/- 241581.30, N = 15)
DDR5-4800: 16817111 (SE +/- 233412.71, N = 15)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Random Read

Speedb 2.7: Op/s, More Is Better
DDR5-6000: 822102828 (SE +/- 707549.54, N = 3)
DDR5-4800: 818556768 (SE +/- 4175835.13, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Read While Writing

Speedb 2.7: Op/s, More Is Better
DDR5-4800: 18826431 (SE +/- 78558.58, N = 3)
DDR5-6000: 18687897 (SE +/- 433421.14, N = 12)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Memcached

Set To Get Ratio: 1:100

Memcached 1.6.19: Ops/sec, More Is Better
DDR5-4800: 13455536.72 (SE +/- 14934.26, N = 3)
DDR5-6000: 13316660.10 (SE +/- 136953.60, N = 3)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Apache IoTDB

Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400

Apache IoTDB 1.2: point/sec, More Is Better
DDR5-4800: 118270441 (SE +/- 266876.94, N = 3)
DDR5-6000: 117516797 (SE +/- 120972.23, N = 3)

Apache IoTDB

Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100

Apache IoTDB 1.2: point/sec, More Is Better
DDR5-6000: 134210671 (SE +/- 338573.54, N = 3)
DDR5-4800: 133128010 (SE +/- 33011.72, N = 3)

ClickHouse

100M Rows Hits Dataset, First Run / Cold Cache

ClickHouse 22.12.3.5: Queries Per Minute, Geo Mean, More Is Better
DDR5-6000: 698.12 (SE +/- 6.12, N = 3; MIN: 80.86 / MAX: 6666.67)
DDR5-4800: 680.81 (SE +/- 4.99, N = 3; MIN: 82.42 / MAX: 6666.67)

ClickHouse

100M Rows Hits Dataset, Second Run

ClickHouse 22.12.3.5: Queries Per Minute, Geo Mean, More Is Better
DDR5-6000: 729.56 (SE +/- 5.27, N = 3; MIN: 81.63 / MAX: 7500)
DDR5-4800: 693.77 (SE +/- 2.33, N = 3; MIN: 84.27 / MAX: 6666.67)

ClickHouse

100M Rows Hits Dataset, Third Run

ClickHouse 22.12.3.5: Queries Per Minute, Geo Mean, More Is Better
DDR5-6000: 724.64 (SE +/- 4.70, N = 3; MIN: 85.11 / MAX: 6666.67)
DDR5-4800: 704.36 (SE +/- 2.37, N = 3; MIN: 83.45 / MAX: 6666.67)

John The Ripper

Test: Blowfish

John The Ripper 2023.03.14: Real C/S, More Is Better
DDR5-6000: 323010 (SE +/- 53.35, N = 3)
DDR5-4800: 322396 (SE +/- 53.65, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2

John The Ripper

Test: bcrypt

John The Ripper 2023.03.14: Real C/S, More Is Better
DDR5-6000: 323000 (SE +/- 45.32, N = 3)
DDR5-4800: 322406 (SE +/- 44.46, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2

John The Ripper

Test: WPA PSK

John The Ripper 2023.03.14: Real C/S, More Is Better
DDR5-6000: 1361333 (SE +/- 1201.85, N = 3)
DDR5-4800: 1360000 (SE +/- 1527.53, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 44063000 (SE +/- 52003.21, N = 3)
DDR5-6000: 44054000 (SE +/- 3511.88, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 68321333 (SE +/- 65397.08, N = 3)
DDR5-6000: 68181667 (SE +/- 98289.26, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 23164333 (SE +/- 333.33, N = 3)
DDR5-6000: 23093667 (SE +/- 13860.42, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-6000: 2784566667 (SE +/- 233333.33, N = 3)
DDR5-4800: 2783133333 (SE +/- 352766.84, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-6000: 3299400000 (SE +/- 5392896.56, N = 3)
DDR5-4800: 3298933333 (SE +/- 693621.73, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 1446266667 (SE +/- 1386041.53, N = 3)
DDR5-6000: 1442500000 (SE +/- 1960442.13, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 5424133333 (SE +/- 166666.67, N = 3)
DDR5-6000: 5407600000 (SE +/- 9856131.76, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-6000: 5876566667 (SE +/- 20311928.62, N = 3)
DDR5-4800: 5834800000 (SE +/- 19352605.34, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-6000: 2487400000 (SE +/- 1305118.13, N = 3)
DDR5-4800: 2484533333 (SE +/- 1474599.76, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 256 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-6000: 8494733333 (SE +/- 14339494.80, N = 3)
DDR5-4800: 8463100000 (SE +/- 24860209.17, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 256 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 8196166667 (SE +/- 3233333.33, N = 3)
DDR5-6000: 8151900000 (SE +/- 6331139.97, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 256 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6: samples/s, More Is Better
DDR5-4800: 2855600000 (SE +/- 1059874.21, N = 3)
DDR5-6000: 2855466667 (SE +/- 3925274.23, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Numpy Benchmark

Numpy Benchmark: Score, More Is Better
DDR5-4800: 796.77 (SE +/- 1.53, N = 3)
DDR5-6000: 795.83 (SE +/- 1.03, N = 3)

OpenSSL

Algorithm: RSA4096

OpenSSL 3.3: sign/s, More Is Better
DDR5-4800: 69052.5 (SE +/- 80.51, N = 3)
DDR5-6000: 69037.7 (SE +/- 15.58, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

Llamafile

Test: llava-v1.5-7b-q4 - Acceleration: CPU

Llamafile 0.7: Tokens Per Second, More Is Better
DDR5-6000: 30.01 (SE +/- 0.32, N = 5)
DDR5-4800: 29.54 (SE +/- 0.23, N = 3)

Llamafile

Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU

Llamafile 0.7: Tokens Per Second, More Is Better
DDR5-6000: 10.64 (SE +/- 0.06, N = 3)
DDR5-4800: 9.71 (SE +/- 0.01, N = 3)

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4: Total Mop/s, More Is Better
DDR5-6000: 385558.53 (SE +/- 5529.34, N = 15)
DDR5-4800: 383603.91 (SE +/- 5209.32, N = 12)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: SP.C

NAS Parallel Benchmarks 3.4: Total Mop/s, More Is Better
DDR5-6000: 194853.91 (SE +/- 878.07, N = 5)
DDR5-4800: 172595.77 (SE +/- 1358.70, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4: Total Mop/s, More Is Better
DDR5-6000: 8647.62 (SE +/- 55.66, N = 6)
DDR5-4800: 7995.28 (SE +/- 34.68, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: MG.C

NAS Parallel Benchmarks 3.4: Total Mop/s, More Is Better
DDR5-6000: 167026.93 (SE +/- 1302.88, N = 15)
DDR5-4800: 149191.03 (SE +/- 239.36, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4: Total Mop/s, More Is Better
DDR5-4800: 88719.53 (SE +/- 1060.91, N = 3)
DDR5-6000: 88208.36 (SE +/- 618.06, N = 15)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Write

PostgreSQL 16: TPS, More Is Better
DDR5-4800: 127310 (SE +/- 154.18, N = 3)
DDR5-6000: 126334 (SE +/- 68.35, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only

PostgreSQL 16: TPS, More Is Better
DDR5-6000: 5418672 (SE +/- 39392.13, N = 3)
DDR5-4800: 5364577 (SE +/- 27253.44, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

OpenSSL

Algorithm: RSA4096

OpenSSL 3.3: verify/s, More Is Better
DDR5-4800: 2757741.0 (SE +/- 282.94, N = 3)
DDR5-6000: 2755042.3 (SE +/- 331.00, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

BRL-CAD

VGR Performance Metric

BRL-CAD 7.38.2: VGR Performance Metric, More Is Better
DDR5-6000: 5899055
DDR5-4800: 5848380
1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6

LULESH

LULESH 2.0.3: z/s, More Is Better
DDR5-6000: 37645.97 (SE +/- 450.61, N = 15)
DDR5-4800: 35386.94 (SE +/- 386.37, N = 15)
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

Apache IoTDB

Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400

Apache IoTDB 1.2: Average Latency, Fewer Is Better
DDR5-4800: 225.70 (SE +/- 1.40, N = 3; MAX: 26697.41)
DDR5-6000: 225.87 (SE +/- 0.92, N = 3; MAX: 26595.42)

Apache IoTDB

Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100

Apache IoTDB 1.2: Average Latency, Fewer Is Better
DDR5-6000: 57.30 (SE +/- 0.13, N = 3; MAX: 23846.01)
DDR5-4800: 57.58 (SE +/- 0.13, N = 3; MAX: 23818.56)

Pennant

Test: leblancbig

Pennant 1.0.1: Hydro Cycle Time - Seconds, Fewer Is Better
DDR5-6000: 2.761260 (SE +/- 0.045208, N = 15)
DDR5-4800: 2.856528 (SE +/- 0.001976, N = 3)
1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
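For "Fewer Is Better" results such as this Pennant run, the speedup is the ratio of the longer time to the shorter one rather than the other way around. A minimal sketch of that arithmetic follows; `percent_faster` is a hypothetical helper name, and the inputs are the Pennant cycle times reported above.

```python
def percent_faster(time_fast: float, time_slow: float) -> float:
    """Speedup of the shorter time over the longer one, in percent."""
    return (time_slow / time_fast - 1.0) * 100.0

# Pennant 1.0.1 leblancbig hydro cycle times, in seconds (fewer is better).
pennant_ddr5_6000 = 2.761260
pennant_ddr5_4800 = 2.856528

speedup = percent_faster(pennant_ddr5_6000, pennant_ddr5_4800)
print(f"DDR5-6000 completes the Pennant run {speedup:.2f}% faster")
```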

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-4800: 82.72 (SE +/- 0.05, N = 3)
DDR5-6000: 82.87 (SE +/- 0.33, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-6000: 68.72 (SE +/- 0.11, N = 3)
DDR5-4800: 69.36 (SE +/- 0.42, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-4800: 309.34 (SE +/- 3.94, N = 3)
DDR5-6000: 309.76 (SE +/- 4.35, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-4800: 5.12255 (SE +/- 0.01638, N = 3)
DDR5-6000: 5.14629 (SE +/- 0.00564, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-4800: 40.67 (SE +/- 0.07, N = 3)
DDR5-6000: 40.75 (SE +/- 0.21, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-6000: 5.14539 (SE +/- 0.01008, N = 3)
DDR5-4800: 5.19672 (SE +/- 0.02595, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-6000: 22.66 (SE +/- 0.24, N = 15)
DDR5-4800: 22.94 (SE +/- 0.18, N = 15)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-4800: 24.17 (SE +/- 0.14, N = 3)
DDR5-6000: 24.27 (SE +/- 0.12, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-6000: 1.02998 (SE +/- 0.00186, N = 3)
DDR5-4800: 1.04331 (SE +/- 0.00860, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.17 - Inference Time Cost (ms), Fewer Is Better
DDR5-6000: 4.22823 (SE +/- 0.02413, N = 3)
DDR5-4800: 4.27007 (SE +/- 0.03015, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

PyBench

Total For Average Test Times

PyBench 2018-02-16 - Milliseconds, Fewer Is Better
DDR5-4800: 582 (SE +/- 3.93, N = 3)
DDR5-6000: 582 (SE +/- 1.70, N = 4)

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency

PostgreSQL 16 - ms, Fewer Is Better
DDR5-4800: 7.855 (SE +/- 0.010, N = 3)
DDR5-6000: 7.916 (SE +/- 0.004, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency

PostgreSQL 16 - ms, Fewer Is Better
DDR5-6000: 0.185 (SE +/- 0.001, N = 3)
DDR5-4800: 0.187 (SE +/- 0.001, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 1.0 - ms, Fewer Is Better
DDR5-6000: 639 (SE +/- 0.33, N = 3)
DDR5-4800: 642 (SE +/- 0.33, N = 3)

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 1.0 - ms, Fewer Is Better
DDR5-6000: 10154 (SE +/- 3.71, N = 3)
DDR5-4800: 10200 (SE +/- 7.81, N = 3)

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 1.0 - ms, Fewer Is Better
DDR5-6000: 20379 (SE +/- 39.68, N = 3)
DDR5-4800: 20428 (SE +/- 13.58, N = 3)

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 1.0 - ms, Fewer Is Better
DDR5-6000: 753 (SE +/- 0.58, N = 3)
DDR5-4800: 758 (SE +/- 0.67, N = 3)

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 1.0 - ms, Fewer Is Better
DDR5-6000: 11982 (SE +/- 14.19, N = 3)
DDR5-4800: 12063 (SE +/- 11.86, N = 3)

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 1.0 - ms, Fewer Is Better
DDR5-6000: 23981 (SE +/- 32.92, N = 3)
DDR5-4800: 24131 (SE +/- 51.91, N = 3)

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-4800: 328.94 (SE +/- 0.37, N = 3; MIN: 143.56 / MAX: 354.17)
DDR5-6000: 329.53 (SE +/- 0.54, N = 3; MIN: 165.13 / MAX: 354.63)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-4800: 0.48 (SE +/- 0.00, N = 3; MIN: 0.13 / MAX: 26.21)
DDR5-6000: 0.48 (SE +/- 0.01, N = 3; MIN: 0.13 / MAX: 26.24)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 87.14 (SE +/- 0.54, N = 15; MIN: 35.35 / MAX: 200.15)
DDR5-4800: 94.77 (SE +/- 0.19, N = 3; MIN: 40.52 / MAX: 153.33)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-4800: 6.61 (SE +/- 0.01, N = 3; MIN: 2.24 / MAX: 27.98)
DDR5-6000: 6.61 (SE +/- 0.01, N = 3; MIN: 2.25 / MAX: 28.15)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 5.62 (SE +/- 0.00, N = 3; MIN: 2.01 / MAX: 29.21)
DDR5-4800: 5.63 (SE +/- 0.00, N = 3; MIN: 2.39 / MAX: 30.13)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 7.22 (SE +/- 0.01, N = 3; MIN: 4.48 / MAX: 26.34)
DDR5-4800: 7.27 (SE +/- 0.01, N = 3; MIN: 4.1 / MAX: 24.92)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 57.57 (SE +/- 0.16, N = 3; MIN: 34.99 / MAX: 106.02)
DDR5-4800: 61.24 (SE +/- 0.40, N = 3; MIN: 27.85 / MAX: 106.26)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-4800: 3.87 (SE +/- 0.01, N = 3; MIN: 1.54 / MAX: 25.35)
DDR5-6000: 3.88 (SE +/- 0.00, N = 3; MIN: 1.55 / MAX: 24.23)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-4800: 26.65 (SE +/- 0.00, N = 3; MIN: 15.32 / MAX: 48.24)
DDR5-6000: 26.67 (SE +/- 0.03, N = 3; MIN: 15.22 / MAX: 47.89)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 19.36 (SE +/- 0.01, N = 3; MIN: 9.99 / MAX: 44.41)
DDR5-4800: 19.73 (SE +/- 0.01, N = 3; MIN: 9.5 / MAX: 45.5)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 4.48 (SE +/- 0.00, N = 3; MIN: 1.99 / MAX: 22.31)
DDR5-4800: 4.49 (SE +/- 0.00, N = 3; MIN: 1.96 / MAX: 20.97)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
DDR5-6000: 10.73 (SE +/- 0.05, N = 3; MIN: 6.4 / MAX: 32.23)
DDR5-4800: 11.09 (SE +/- 0.01, N = 3; MIN: 6.01 / MAX: 31.79)
(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

oneDNN 3.4 - ms, Fewer Is Better
DDR5-4800: 0.507850 (SE +/- 0.000838, N = 3; MIN: 0.48)
DDR5-6000: 0.513417 (SE +/- 0.004859, N = 15; MIN: 0.48)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.8 - Seconds, Fewer Is Better
DDR5-6000: 23.34 (SE +/- 0.13, N = 3)
DDR5-4800: 23.47 (SE +/- 0.19, N = 3)

Timed Linux Kernel Compilation

Build: allmodconfig

Timed Linux Kernel Compilation 6.8 - Seconds, Fewer Is Better
DDR5-4800: 198.90 (SE +/- 1.43, N = 3)
DDR5-6000: 201.76 (SE +/- 0.88, N = 3)

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 7.0 - Seconds, Fewer Is Better
DDR5-4800: 13.92 (SE +/- 0.05, N = 3)
DDR5-6000: 13.98 (SE +/- 0.07, N = 4)

Timed Godot Game Engine Compilation

Time To Compile

Timed Godot Game Engine Compilation 4.0 - Seconds, Fewer Is Better
DDR5-6000: 83.29 (SE +/- 0.10, N = 3)
DDR5-4800: 84.72 (SE +/- 0.27, N = 3)

Timed Node.js Compilation

Time To Compile

Timed Node.js Compilation 21.7.2 - Seconds, Fewer Is Better
DDR5-6000: 109.56 (SE +/- 0.21, N = 3)
DDR5-4800: 110.55 (SE +/- 0.13, N = 3)

Timed Gem5 Compilation

Time To Compile

Timed Gem5 Compilation 23.0.1 - Seconds, Fewer Is Better
DDR5-6000: 128.13 (SE +/- 1.52, N = 3)
DDR5-4800: 130.54 (SE +/- 1.70, N = 3)

Timed LLVM Compilation

Build System: Ninja

Timed LLVM Compilation 16.0 - Seconds, Fewer Is Better
DDR5-6000: 88.89 (SE +/- 0.32, N = 3)
DDR5-4800: 89.65 (SE +/- 0.30, N = 3)

Timed Mesa Compilation

Time To Compile

Timed Mesa Compilation 24.0 - Seconds, Fewer Is Better
DDR5-4800: 13.66 (SE +/- 0.07, N = 3)
DDR5-6000: 13.67 (SE +/- 0.07, N = 4)

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.9.0 - Seconds, Fewer Is Better
DDR5-6000: 9.868 (SE +/- 0.013, N = 5)
DDR5-4800: 10.012 (SE +/- 0.051, N = 3)

WRF

Input: conus 2.5km

WRF 4.2.2 - Seconds, Fewer Is Better
DDR5-6000: 5574.36
DDR5-4800: 6206.69
(F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

m-queens

Time To Solve

m-queens 1.2 - Seconds, Fewer Is Better
DDR5-4800: 5.454 (SE +/- 0.036, N = 3)
DDR5-6000: 5.492 (SE +/- 0.025, N = 7)
(CXX) g++ options: -fopenmp -O2 -march=native

CloverLeaf

Input: clover_bm64_short

CloverLeaf 1.3 - Seconds, Fewer Is Better
DDR5-6000: 22.09 (SE +/- 0.07, N = 3)
DDR5-4800: 25.79 (SE +/- 0.11, N = 3)
(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

CloverLeaf

Input: clover_bm16

CloverLeaf 1.3 - Seconds, Fewer Is Better
DDR5-6000: 178.93 (SE +/- 0.24, N = 3)
DDR5-4800: 209.96 (SE +/- 0.08, N = 3)
(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
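
For "Fewer Is Better" results such as CloverLeaf, the two timings can be turned into a relative reduction in run time. The sketch below shows one way to do that arithmetic, using the CloverLeaf values reported in this export; the helper name `percent_time_saved` is hypothetical and not part of the Phoronix Test Suite.

```python
def percent_time_saved(baseline_s: float, candidate_s: float) -> float:
    """Percent reduction in run time of the candidate relative to the baseline."""
    return (baseline_s - candidate_s) / baseline_s * 100.0

# CloverLeaf 1.3 run times in seconds, as reported in this export.
cloverleaf = {
    "clover_bm64_short": {"DDR5-4800": 25.79, "DDR5-6000": 22.09},
    "clover_bm16": {"DDR5-4800": 209.96, "DDR5-6000": 178.93},
}

for name, seconds in cloverleaf.items():
    saved = percent_time_saved(seconds["DDR5-4800"], seconds["DDR5-6000"])
    print(f"{name}: DDR5-6000 takes {saved:.1f}% less time than DDR5-4800")
```

For "More Is Better" metrics the same comparison would instead divide the gain by the baseline throughput.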

GPAW

Input: Carbon Nanotube

GPAW 23.6 - Seconds, Fewer Is Better
DDR5-6000: 23.82 (SE +/- 0.10, N = 3)
DDR5-4800: 24.30 (SE +/- 0.13, N = 3)
(CC) gcc options: -shared -lxc -lblas -lmpi

NWChem

Input: C240 Buckyball

NWChem 7.0.2 - Seconds, Fewer Is Better
DDR5-6000: 1326.2
DDR5-4800: 1328.9
(F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 - Seconds, Fewer Is Better
DDR5-6000: 6.47804538 (SE +/- 0.01274608, N = 6)
DDR5-4800: 6.79525073 (SE +/- 0.01024398, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Xcompact3d Incompact3d

Input: X3D-benchmarking input.i3d

Xcompact3d Incompact3d 2021-03-11 - Seconds, Fewer Is Better
DDR5-6000: 199.84 (SE +/- 0.78, N = 3)
DDR5-4800: 222.62 (SE +/- 0.04, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Mesh Time

OpenFOAM 10 - Seconds, Fewer Is Better
DDR5-6000: 21.44
DDR5-4800: 22.90
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Execution Time

OpenFOAM 10 - Seconds, Fewer Is Better
DDR5-4800: 20.55
DDR5-6000: 20.60
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Execution Time

OpenFOAM 10 - Seconds, Fewer Is Better
DDR5-6000: 160.37
DDR5-4800: 177.11
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenRadioss

Model: Cell Phone Drop Test

OpenRadioss 2023.09.15 - Seconds, Fewer Is Better
DDR5-4800: 17.74 (SE +/- 0.12, N = 3)
DDR5-6000: 17.87 (SE +/- 0.16, N = 3)

OpenRadioss

Model: INIVOL and Fluid Structure Interaction Drop Container

OpenRadioss 2023.09.15 - Seconds, Fewer Is Better
DDR5-6000: 73.67 (SE +/- 0.07, N = 3)
DDR5-4800: 73.70 (SE +/- 0.35, N = 3)

OpenRadioss

Model: Chrysler Neon 1M

OpenRadioss 2023.09.15 - Seconds, Fewer Is Better
DDR5-6000: 77.26 (SE +/- 0.11, N = 3)
DDR5-4800: 85.36 (SE +/- 0.19, N = 3)

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.1 - Seconds, Fewer Is Better
DDR5-6000: 9.46 (SE +/- 0.01, N = 5)
DDR5-4800: 9.63 (SE +/- 0.08, N = 3)

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.1 - Seconds, Fewer Is Better
DDR5-6000: 24.00 (SE +/- 0.05, N = 3)
DDR5-4800: 24.21 (SE +/- 0.04, N = 3)

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.1 - Seconds, Fewer Is Better
DDR5-6000: 12.50 (SE +/- 0.01, N = 4)
DDR5-4800: 12.53 (SE +/- 0.05, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.1 - Seconds, Fewer Is Better
DDR5-6000: 29.75 (SE +/- 0.06, N = 3)
DDR5-4800: 29.90 (SE +/- 0.10, N = 3)

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.1 - Seconds, Fewer Is Better
DDR5-6000: 86.62 (SE +/- 0.12, N = 3)
DDR5-4800: 87.28 (SE +/- 0.08, N = 3)

Blender

Blend File: Junkshop - Compute: CPU-Only

Blender 4.1 - Seconds, Fewer Is Better
DDR5-6000: 12.71 (SE +/- 0.03, N = 4)
DDR5-4800: 12.98 (SE +/- 0.08, N = 3)

Appleseed

Scene: Disney Material

Appleseed 2.0 Beta - Seconds, Fewer Is Better
DDR5-4800: 30.50
DDR5-6000: 30.69

Parallel BZIP2 Compression

FreeBSD-13.0-RELEASE-amd64-memstick.img Compression

Parallel BZIP2 Compression 1.1.13 - Seconds, Fewer Is Better
DDR5-6000: 0.934896 (SE +/- 0.010948, N = 15)
DDR5-4800: 0.941118 (SE +/- 0.009629, N = 15)
(CXX) g++ options: -O2 -pthread -lbz2 -lpthread

libavif avifenc

Encoder Speed: 0

libavif avifenc 1.0 - Seconds, Fewer Is Better
DDR5-6000: 47.47 (SE +/- 0.03, N = 3)
DDR5-4800: 47.55 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 2

libavif avifenc 1.0 - Seconds, Fewer Is Better
DDR5-6000: 24.80 (SE +/- 0.03, N = 3)
DDR5-4800: 24.81 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6

libavif avifenc 1.0 - Seconds, Fewer Is Better
DDR5-6000: 2.003
DDR5-4800: 2.006
SE +/- 0.006, N = 2 (only one standard error appears in the export; which configuration it belongs to is not recoverable)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 1.0 - Seconds, Fewer Is Better
DDR5-6000: 4.221 (SE +/- 0.006, N = 8)
DDR5-4800: 4.256 (SE +/- 0.010, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10, Lossless

libavif avifenc 1.0 - Seconds, Fewer Is Better
DDR5-4800: 3.452 (SE +/- 0.005, N = 3)
DDR5-6000: 3.459 (SE +/- 0.003, N = 9)
(CXX) g++ options: -O3 -fPIC -lm

GNU Octave Benchmark

GNU Octave Benchmark 8.4.0 - Seconds, Fewer Is Better
DDR5-6000: 4.610 (SE +/- 0.010, N = 8)
DDR5-4800: 4.663 (SE +/- 0.010, N = 5)

Helsing

Digit Range: 14 digit

Helsing 1.0-beta - Seconds, Fewer Is Better
DDR5-6000: 43.06 (SE +/- 0.13, N = 3)
DDR5-4800: 43.44 (SE +/- 0.11, N = 3)
(CC) gcc options: -O2 -pthread

Primesieve

Length: 1e12

Primesieve 12.1 - Seconds, Fewer Is Better
DDR5-6000: 1.553 (SE +/- 0.004, N = 12)
DDR5-4800: 1.559 (SE +/- 0.013, N = 3)
(CXX) g++ options: -O3

Primesieve

Length: 1e13

Primesieve 12.1 - Seconds, Fewer Is Better
DDR5-4800: 17.96 (SE +/- 0.01, N = 3)
DDR5-6000: 17.97 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3

Y-Cruncher

Pi Digits To Calculate: 500M

Y-Cruncher 0.8.3 - Seconds, Fewer Is Better
DDR5-4800: 4.843 (SE +/- 0.019, N = 3)
DDR5-6000: 4.877 (SE +/- 0.005, N = 6)

Y-Cruncher

Pi Digits To Calculate: 1B

Y-Cruncher 0.8.3 - Seconds, Fewer Is Better
DDR5-4800: 8.113 (SE +/- 0.013, N = 3)
DDR5-6000: 8.173 (SE +/- 0.015, N = 5)

QMCPACK

Input: Li2_STO_ae

QMCPACK 3.17.1 - Total Execution Time - Seconds, Fewer Is Better
DDR5-4800: 78.80 (SE +/- 0.54, N = 3)
DDR5-6000: 79.02 (SE +/- 0.27, N = 3)
(CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl


Phoronix Test Suite v10.8.5