AMD Threadripper 7995WX NPS / SNC2 SNC4 Benchmarks

Testing of NPS/SNC settings on the AMD Ryzen Threadripper PRO 7995WX 96-Cores in default (disabled), SNC2, and SNC4 modes. Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2311288-NE-TR7995WXN68&grr&rdt.

System Details

Configurations: Default - Disabled, SNC2, SNC4

Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads)
Motherboard: HP 8B24 (U65 Ver. 01.01.04 BIOS)
Chipset: AMD Device 14a4
Memory: 128GB
Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1
Graphics: NVIDIA RTX A4000 16GB
Audio: NVIDIA GA104 HD Audio
Monitor: ASUS VP28U
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 23.10
Kernel: 6.5.0-13-generic (x86_64)
Desktop: GNOME Shell 45.0
Display Server: X Server 1.21.1.7
Display Driver: NVIDIA 535.129.03
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.2.147
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105

OpenCL Details: GPU Compute Cores: 6144

Python Details: Python 3.11.6

Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result Overview

Every test in this export was run in three configurations: Default - Disabled, SNC2, and SNC4. The values for each test are listed in its own section below.

GPAW

Input: Carbon Nanotube

GPAW 23.6 (Seconds, Fewer Is Better)
Default - Disabled: 38.13 (SE +/- 0.34, N = 3)
SNC2: 37.80 (SE +/- 0.23, N = 3)
SNC4: 37.16 (SE +/- 0.25, N = 3)
(CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

PETSc

Test: Streams

PETSc 3.19 (MB/s, More Is Better)
Default - Disabled: 183161.11 (SE +/- 64.39, N = 3)
SNC2: 185070.83 (SE +/- 708.72, N = 3)
SNC4: 187197.63 (SE +/- 451.28, N = 3)
(CC) gcc options: -fPIC -O3 -O2 -lpthread -lpciaccess -lm

OpenVKL

Benchmark: vklBenchmarkCPU ISPC

OpenVKL 2.0.0 (Items / Sec, More Is Better)
Default - Disabled: 2153 (SE +/- 0.88, N = 3; MIN: 179 / MAX: 27831)
SNC2: 2089 (SE +/- 2.73, N = 3; MIN: 178 / MAX: 27767)
SNC4: 1905 (SE +/- 5.29, N = 3; MIN: 180 / MAX: 27886)

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2023 (Ns Per Day, More Is Better)
Default - Disabled: 10.38 (SE +/- 0.38, N = 9)
SNC2: 10.45 (SE +/- 0.07, N = 3)
SNC4: 10.66 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 6.34 (SE +/- 0.02, N = 3; MIN: 5.68 / MAX: 6.64)
SNC2: 5.36 (SE +/- 0.06, N = 3; MIN: 3.75 / MAX: 5.85)
SNC4: 4.15 (SE +/- 0.23, N = 9; MIN: 1.09 / MAX: 6.41)

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency

PostgreSQL 16 (ms, Fewer Is Better)
Default - Disabled: 55.40 (SE +/- 0.50, N = 12)
SNC2: 72.15 (SE +/- 3.12, N = 12)
SNC4: 72.65 (SE +/- 3.47, N = 12)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Write

PostgreSQL 16 (TPS, More Is Better)
Default - Disabled: 18068 (SE +/- 160.41, N = 12)
SNC2: 14096 (SE +/- 496.25, N = 12)
SNC4: 14067 (SE +/- 577.54, N = 12)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

ClickHouse

100M Rows Hits Dataset, Third Run

ClickHouse 22.12.3.5 (Queries Per Minute, Geo Mean, More Is Better)
Default - Disabled: 504.59 (SE +/- 5.43, N = 3; MIN: 58.03 / MAX: 3750)
SNC2: 444.12 (SE +/- 8.91, N = 12; MIN: 48.62 / MAX: 7500)
SNC4: 400.96 (SE +/- 10.68, N = 12; MIN: 41.1 / MAX: 6000)

ClickHouse

100M Rows Hits Dataset, Second Run

ClickHouse 22.12.3.5 (Queries Per Minute, Geo Mean, More Is Better)
Default - Disabled: 490.52 (SE +/- 6.17, N = 3; MIN: 34.27 / MAX: 4615.38)
SNC2: 438.54 (SE +/- 8.80, N = 12; MIN: 35.21 / MAX: 6666.67)
SNC4: 397.01 (SE +/- 10.89, N = 12; MIN: 39.66 / MAX: 6000)

ClickHouse

100M Rows Hits Dataset, First Run / Cold Cache

ClickHouse 22.12.3.5 (Queries Per Minute, Geo Mean, More Is Better)
Default - Disabled: 457.06 (SE +/- 4.35, N = 3; MIN: 47.36 / MAX: 4285.71)
SNC2: 417.30 (SE +/- 8.46, N = 12; MIN: 32.68 / MAX: 6000)
SNC4: 388.42 (SE +/- 10.37, N = 12; MIN: 30.26 / MAX: 6000)

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 6.36 (SE +/- 0.01, N = 3; MIN: 5.69 / MAX: 6.62)
SNC2: 5.38 (SE +/- 0.02, N = 3; MIN: 3.75 / MAX: 5.87)
SNC4: 3.68 (SE +/- 0.13, N = 6; MIN: 1.19 / MAX: 6.36)

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 6.35 (SE +/- 0.02, N = 3; MIN: 5.72 / MAX: 6.64)
SNC2: 5.31 (SE +/- 0.04, N = 3; MIN: 3.81 / MAX: 5.85)
SNC4: 4.07 (SE +/- 0.29, N = 6; MIN: 1.11 / MAX: 6.41)

libxsmm

M N K: 128

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
Default - Disabled: 2043.5 (SE +/- 2.22, N = 3)
SNC2: 2026.2 (SE +/- 0.82, N = 3)
SNC4: 2004.7 (SE +/- 11.98, N = 3)
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.12 (images/sec, More Is Better)
Default - Disabled: 135.02 (SE +/- 0.04, N = 3)
SNC2: 130.65 (SE +/- 0.21, N = 3)
SNC4: 127.24 (SE +/- 0.33, N = 3)

PostgreSQL

Scaling Factor: 1000 - Clients: 1000 - Mode: Read Write - Average Latency

PostgreSQL 16 (ms, Fewer Is Better)
Default - Disabled: 60.66 (SE +/- 1.61, N = 12)
SNC2: 63.44 (SE +/- 0.82, N = 3)
SNC4: 65.33 (SE +/- 0.79, N = 4)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 1000 - Clients: 1000 - Mode: Read Write

PostgreSQL 16 (TPS, More Is Better)
Default - Disabled: 16623 (SE +/- 472.23, N = 12)
SNC2: 15769 (SE +/- 204.24, N = 3)
SNC4: 15314 (SE +/- 181.92, N = 4)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

CloverLeaf

Input: clover_bm16

CloverLeaf 1.3 (Seconds, Fewer Is Better)
Default - Disabled: 329.61 (SE +/- 0.23, N = 3)
SNC2: 355.04 (SE +/- 1.55, N = 3)
SNC4: 396.53 (SE +/- 1.41, N = 3)
(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Quantum ESPRESSO

Input: AUSURF112

Quantum ESPRESSO 7.0 (Seconds, Fewer Is Better)
Default - Disabled: 326.06 (SE +/- 0.35, N = 3)
SNC2: 316.50 (SE +/- 0.87, N = 3)
SNC4: 307.06 (SE +/- 0.32, N = 3)
(F9X) gfortran options: -pthread -fopenmp -ldevXlib -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3_omp -lfftw3 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

PostgreSQL

Scaling Factor: 1000 - Clients: 1000 - Mode: Read Only - Average Latency

PostgreSQL 16 (ms, Fewer Is Better)
Default - Disabled: 0.504 (SE +/- 0.002, N = 3)
SNC2: 0.513 (SE +/- 0.005, N = 6)
SNC4: 0.511 (SE +/- 0.004, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 1000 - Clients: 1000 - Mode: Read Only

PostgreSQL 16 (TPS, More Is Better)
Default - Disabled: 1986807 (SE +/- 7755.83, N = 3)
SNC2: 1950507 (SE +/- 19687.71, N = 6)
SNC4: 1956509 (SE +/- 14067.03, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.12 (images/sec, More Is Better)
Default - Disabled: 118.86 (SE +/- 0.12, N = 3)
SNC2: 115.90 (SE +/- 0.23, N = 3)
SNC4: 112.08 (SE +/- 0.12, N = 3)

Stockfish

Total Time

Stockfish 15 (Nodes Per Second, More Is Better)
Default - Disabled: 287331097 (SE +/- 2377868.67, N = 3)
SNC2: 295780798 (SE +/- 3381815.04, N = 15)
SNC4: 300267148 (SE +/- 6406046.35, N = 15)
(CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver

LuxCoreRender

Scene: Orange Juice - Acceleration: CPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
Default - Disabled: 22.24 (SE +/- 0.26, N = 15; MIN: 18.56 / MAX: 28.74)
SNC2: 22.77 (SE +/- 0.29, N = 15; MIN: 18.36 / MAX: 29.05)
SNC4: 21.60 (SE +/- 0.06, N = 3; MIN: 18.58 / MAX: 28.17)

John The Ripper

Test: MD5

John The Ripper 2023.03.14 (Real C/S, More Is Better)
Default - Disabled: 14600667 (SE +/- 43978.53, N = 3)
SNC2: 13888800 (SE +/- 193319.26, N = 15)
SNC4: 13419000 (SE +/- 174193.35, N = 15)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2

Rodinia

Test: OpenMP HotSpot3D

Rodinia 3.1 (Seconds, Fewer Is Better)
Default - Disabled: 58.87 (SE +/- 0.82, N = 3)
SNC2: 59.86 (SE +/- 0.62, N = 15)
SNC4: 61.23 (SE +/- 0.53, N = 15)
(CXX) g++ options: -O2 -lOpenCL

ASKAP

Test: tConvolve MT - Degridding

ASKAP 1.0 (Million Grid Points Per Second, More Is Better)
Default - Disabled: 11871.2 (SE +/- 75.70, N = 3)
SNC2: 11209.2 (SE +/- 13.79, N = 3)
SNC4: 10522.5 (SE +/- 138.85, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MT - Gridding

ASKAP 1.0 (Million Grid Points Per Second, More Is Better)
Default - Disabled: 8655.84 (SE +/- 13.20, N = 3)
SNC2: 8223.25 (SE +/- 5.09, N = 3)
SNC4: 7467.14 (SE +/- 78.17, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Graph500

Scale: 26

Graph500 3.0 (sssp max_TEPS, More Is Better)
Default - Disabled: 462057000
SNC2: 522187000
SNC4: 527864000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp median_TEPS, More Is Better)
Default - Disabled: 357127000
SNC2: 389235000
SNC4: 390682000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs max_TEPS, More Is Better)
Default - Disabled: 756165000
SNC2: 1069560000
SNC4: 1104900000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs median_TEPS, More Is Better)
Default - Disabled: 727662000
SNC2: 1015300000
SNC4: 1056980000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Timed Linux Kernel Compilation

Build: allmodconfig

Timed Linux Kernel Compilation 6.1 (Seconds, Fewer Is Better)
Default - Disabled: 264.08 (SE +/- 1.24, N = 3)
SNC2: 174.35 (SE +/- 0.85, N = 3)
SNC4: 168.15 (SE +/- 0.44, N = 3)
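
The allmodconfig build times above can be turned into a relative change against the Default - Disabled run, which is often easier to compare across tests than the raw numbers. The following is a minimal sketch: the three values are copied from this result, while the helper names (`percent_change`, `is_improvement`, `summarize`) are hypothetical and not part of the Phoronix Test Suite or OpenBenchmarking.org.

```python
# Values copied from the Timed Linux Kernel Compilation (allmodconfig) result.
# Helper names are illustrative only; they do not come from any benchmarking tool.
BASELINE = "Default - Disabled"
build_seconds = {"Default - Disabled": 264.08, "SNC2": 174.35, "SNC4": 168.15}


def percent_change(baseline, value):
    """Percent change of value relative to baseline (negative means smaller)."""
    return (value - baseline) / baseline * 100.0


def is_improvement(baseline, value, higher_is_better):
    """Interpret the direction of change according to the result's own label."""
    return value > baseline if higher_is_better else value < baseline


def summarize(results, higher_is_better, baseline_key=BASELINE):
    """Map each non-baseline configuration to (percent change, improved?)."""
    base = results[baseline_key]
    return {
        name: (percent_change(base, value),
               is_improvement(base, value, higher_is_better))
        for name, value in results.items()
        if name != baseline_key
    }


if __name__ == "__main__":
    # "Seconds, Fewer Is Better" means higher_is_better=False.
    for name, (change, better) in summarize(build_seconds, False).items():
        verdict = "better" if better else "worse"
        print(f"{name}: {change:+.1f}% vs. {BASELINE} ({verdict})")
```

For a "More Is Better" result, pass `higher_is_better=True` so that a larger value counts as the improvement.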

John The Ripper

Test: HMAC-SHA512

John The Ripper 2023.03.14 (Real C/S, More Is Better)
Default - Disabled: 296518667 (SE +/- 1312178.38, N = 3)
SNC2: 289221400 (SE +/- 1930805.92, N = 15)
SNC4: 287955500 (SE +/- 2535178.67, N = 12)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency

PostgreSQL 16 (ms, Fewer Is Better)
Default - Disabled: 0.264 (SE +/- 0.003, N = 3)
SNC2: 0.267 (SE +/- 0.002, N = 3)
SNC4: 0.266 (SE +/- 0.003, N = 6)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only

PostgreSQL 16 (TPS, More Is Better)
Default - Disabled: 3792698 (SE +/- 49179.25, N = 3)
SNC2: 3746006 (SE +/- 21467.51, N = 3)
SNC4: 3759094 (SE +/- 37681.71, N = 6)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

easyWave

Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400

easyWave r34 (Seconds, Fewer Is Better)
Default - Disabled: 58.65 (SE +/- 0.47, N = 9)
SNC2: 59.25 (SE +/- 0.49, N = 3)
SNC4: 75.17 (SE +/- 3.38, N = 12)
(CXX) g++ options: -O3 -fopenmp

OpenRadioss

Model: Chrysler Neon 1M

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
Default - Disabled: 157.35 (SE +/- 0.18, N = 3)
SNC2: 155.50 (SE +/- 0.28, N = 3)
SNC4: 156.21 (SE +/- 0.27, N = 3)

OpenSSL

Algorithm: AES-128-GCM

OpenSSL 3.1 (byte/s, More Is Better)
Default - Disabled: 941442701447 (SE +/- 2155035998.71, N = 3)
SNC2: 943365732580 (SE +/- 1081587572.26, N = 3)
SNC4: 946933378620 (SE +/- 179395215.71, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

OpenSSL

Algorithm: SHA256

OpenSSL 3.1 (byte/s, More Is Better)
Default - Disabled: 131629499893 (SE +/- 311439878.18, N = 3)
SNC2: 131627463983 (SE +/- 241913832.01, N = 3)
SNC4: 131706698427 (SE +/- 282602364.23, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

OpenSSL

Algorithm: SHA512

OpenSSL 3.1 (byte/s, More Is Better)
Default - Disabled: 42723911653 (SE +/- 32408039.38, N = 3)
SNC2: 43151618910 (SE +/- 19206282.00, N = 3)
SNC4: 43206641983 (SE +/- 16399997.56, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

OpenSSL

Algorithm: ChaCha20-Poly1305

OpenSSL 3.1 (byte/s, More Is Better)
Default - Disabled: 361852375860 (SE +/- 176665463.84, N = 3)
SNC2: 361987547683 (SE +/- 140024932.44, N = 3)
SNC4: 362765500023 (SE +/- 129282934.95, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

OpenSSL

Algorithm: AES-256-GCM

OpenSSL 3.1 (byte/s, More Is Better)
Default - Disabled: 815175705397 (SE +/- 1400557991.78, N = 3)
SNC2: 815393358357 (SE +/- 991457061.79, N = 3)
SNC4: 817131387107 (SE +/- 1023563571.48, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

OpenSSL

Algorithm: ChaCha20

OpenSSL 3.1 (byte/s, More Is Better)
Default - Disabled: 511694244653 (SE +/- 157967680.92, N = 3)
SNC2: 511318347957 (SE +/- 45048908.57, N = 3)
SNC4: 512318580417 (SE +/- 56755200.60, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

libxsmm

M N K: 256

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
Default - Disabled: 2564.6 (SE +/- 18.22, N = 3)
SNC2: 2583.6 (SE +/- 29.68, N = 3)
SNC4: 2599.9 (SE +/- 21.00, N = 3)
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Timed Gem5 Compilation

Time To Compile

Timed Gem5 Compilation 23.0.1 (Seconds, Fewer Is Better)
Default - Disabled: 150.97 (SE +/- 1.82, N = 3)
SNC2: 140.81 (SE +/- 1.48, N = 3)
SNC4: 137.48 (SE +/- 1.46, N = 5)

Timed LLVM Compilation

Build System: Unix Makefiles

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better)
Default - Disabled: 175.47 (SE +/- 0.35, N = 3)
SNC2: 170.83 (SE +/- 0.79, N = 3)
SNC4: 170.54 (SE +/- 0.32, N = 3)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 16.06 (SE +/- 0.16, N = 5; MIN: 15.37 / MAX: 16.74)
SNC2: 15.30 (SE +/- 0.14, N = 3; MIN: 8.79 / MAX: 15.89)
SNC4: 15.39 (SE +/- 0.17, N = 3; MIN: 8.86 / MAX: 16.02)

NAS Parallel Benchmarks

Test / Class: SP.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
Default - Disabled: 89217.91 (SE +/- 61.36, N = 3)
SNC2: 88668.13 (SE +/- 271.76, N = 3)
SNC4: 87467.23 (SE +/- 287.86, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Open MPI 4.1.5

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Execution Time

OpenFOAM 10 (Seconds, Fewer Is Better)
Default - Disabled: 331.86
SNC2: 310.66
SNC4: 302.69
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Mesh Time

OpenFOAM 10 (Seconds, Fewer Is Better)
Default - Disabled: 138.65
SNC2: 135.10
SNC4: 133.74
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.0 (Seconds, Fewer Is Better)
Default - Disabled: 136.29 (SE +/- 0.18, N = 3)
SNC2: 136.20 (SE +/- 0.26, N = 3)
SNC4: 136.28 (SE +/- 0.23, N = 3)
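
The Barbershop times above differ by less than their reported standard errors, which raises the question of when a gap between two configurations is larger than run-to-run noise. One rough way to check, sketched below, is to compare the difference of two means against a multiple of their combined standard error. The function name `within_noise` and the threshold `k = 2.0` are assumptions made for illustration; with only N = 3 runs per configuration this is a coarse heuristic and not a formal significance test.

```python
import math


def within_noise(mean_a, se_a, mean_b, se_b, k=2.0):
    """Return True if |mean_a - mean_b| <= k * sqrt(se_a^2 + se_b^2).

    k is an assumed threshold (about two combined standard errors),
    chosen for illustration rather than taken from the source.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) <= k * combined_se


if __name__ == "__main__":
    # Blender Barbershop, Default - Disabled vs. SNC2 (values from this export).
    print(within_noise(136.29, 0.18, 136.20, 0.26))
    # Timed Linux Kernel Compilation (allmodconfig), Default - Disabled vs. SNC2.
    print(within_noise(264.08, 1.24, 174.35, 0.85))
```

Under these assumptions the Blender pair falls inside the noise band, while the allmodconfig kernel build pair does not.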

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 15.91 (SE +/- 0.11, N = 3; MIN: 15.34 / MAX: 16.26)
SNC2: 16.03 (SE +/- 0.10, N = 3; MIN: 9.32 / MAX: 16.38)
SNC4: 15.24 (SE +/- 0.06, N = 3; MIN: 8.23 / MAX: 15.81)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 16.16 (SE +/- 0.10, N = 3; MIN: 15.77 / MAX: 16.51)
SNC2: 15.36 (SE +/- 0.19, N = 3; MIN: 8.91 / MAX: 15.79)
SNC4: 15.61 (SE +/- 0.13, N = 3; MIN: 8.77 / MAX: 16.18)

Timed Node.js Compilation

Time To Compile

Timed Node.js Compilation 19.8.1 (Seconds, Fewer Is Better)
Default - Disabled: 111.37 (SE +/- 1.21, N = 5)
SNC2: 102.32 (SE +/- 0.40, N = 3)
SNC4: 100.62 (SE +/- 0.39, N = 3)

easyWave

Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200

easyWave r34 (Seconds, Fewer Is Better)
Default - Disabled: 23.99 (SE +/- 0.32, N = 15)
SNC2: 25.48 (SE +/- 0.46, N = 15)
SNC4: 28.74 (SE +/- 0.57, N = 12)
(CXX) g++ options: -O3 -fopenmp

Timed LLVM Compilation

Build System: Ninja

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better)
Default - Disabled: 121.87 (SE +/- 0.08, N = 3)
SNC2: 104.60 (SE +/- 0.30, N = 3)
SNC4: 104.69 (SE +/- 0.52, N = 3)

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 11.16 (SE +/- 0.08, N = 3; MIN: 10.83 / MAX: 11.47)
SNC2: 10.65 (SE +/- 0.10, N = 3; MIN: 6.15 / MAX: 11.04)
SNC4: 10.58 (SE +/- 0.04, N = 3; MIN: 5.94 / MAX: 11.15)

QMCPACK

Input: Li2_STO_ae

QMCPACK 3.17.1 (Total Execution Time - Seconds, Fewer Is Better)
Default - Disabled: 103.90 (SE +/- 0.55, N = 3)
SNC2: 106.01 (SE +/- 1.13, N = 3)
SNC4: 103.77 (SE +/- 0.06, N = 3)
(CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0 (Mpix/sec, More Is Better)
Default - Disabled: 43532.9 (SE +/- 199.70, N = 3)
SNC2: 43138.9 (SE +/- 341.22, N = 3)
SNC4: 35893.7 (SE +/- 743.57, N = 15)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0 (Mpix/sec, More Is Better)
Default - Disabled: 40198.3 (SE +/- 170.33, N = 3)
SNC2: 40552.4 (SE +/- 463.37, N = 3)
SNC4: 34358.5 (SE +/- 691.76, N = 15)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

asmFish

1024 Hash Memory, 26 Depth

asmFish 2018-07-23 (Nodes/second, More Is Better)
Default - Disabled: 248215163 (SE +/- 1898256.07, N = 3)
SNC2: 272254945 (SE +/- 2257424.10, N = 3)
SNC4: 275458088 (SE +/- 1785028.61, N = 3)

Numpy Benchmark

Numpy Benchmark (Score, More Is Better)
Default - Disabled: 746.05 (SE +/- 5.73, N = 3)
SNC2: 758.02 (SE +/- 1.99, N = 3)
SNC4: 753.64 (SE +/- 7.89, N = 3)

John The Ripper

Test: Blowfish

John The Ripper 2023.03.14 (Real C/S, More Is Better)
Default - Disabled: 173145 (SE +/- 66.40, N = 3)
SNC2: 169287 (SE +/- 1828.86, N = 12)
SNC4: 161790 (SE +/- 1944.34, N = 15)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.12 (images/sec, More Is Better)
Default - Disabled: 90.46 (SE +/- 0.05, N = 3)
SNC2: 85.54 (SE +/- 0.22, N = 3)
SNC4: 84.50 (SE +/- 0.09, N = 3)

SPECFEM3D

Model: Tomographic Model

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
Default - Disabled: 7.987430452 (SE +/- 0.088577121, N = 3)
SNC2: 7.544143736 (SE +/- 0.094423144, N = 3)
SNC4: 7.519415000 (SE +/- 0.068276455, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

VVenC 1.9 (Frames Per Second, More Is Better)
Default - Disabled: 8.715 (SE +/- 0.094, N = 3)
SNC2: 8.278 (SE +/- 0.051, N = 3)
SNC4: 8.186 (SE +/- 0.118, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

John The Ripper

Test: bcrypt

John The Ripper 2023.03.14 (Real C/S, More Is Better)
Default - Disabled: 174335 (SE +/- 1219.45, N = 3)
SNC2: 171734 (SE +/- 2018.22, N = 4)
SNC4: 163259 (SE +/- 1540.98, N = 15)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 20017 (SE +/- 42.71, N = 3)
SNC2: 20144 (SE +/- 18.75, N = 3)
SNC4: 20199 (SE +/- 57.00, N = 3)

Memcached

Set To Get Ratio: 1:100

Memcached 1.6.19 (Ops/sec, More Is Better)
Default - Disabled: 7743874.04 (SE +/- 23469.53, N = 3)
SNC2: 7704668.27 (SE +/- 52363.50, N = 3)
SNC4: 7507583.68 (SE +/- 35640.28, N = 3)
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Memcached

Set To Get Ratio: 1:10

Memcached 1.6.19 (Ops/sec, More Is Better)
Default - Disabled: 5811757.93 (SE +/- 23305.65, N = 3)
SNC2: 5818020.66 (SE +/- 17431.43, N = 3)
SNC4: 5728479.09 (SE +/- 8577.07, N = 3)
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Memcached

Set To Get Ratio: 1:5

Memcached 1.6.19 (Ops/sec, More Is Better)
Default - Disabled: 3359726.87 (SE +/- 15559.68, N = 3)
SNC2: 3359880.55 (SE +/- 33459.28, N = 3)
SNC4: 3327393.35 (SE +/- 44466.09, N = 3)
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 1252 (SE +/- 1.53, N = 3)
SNC2: 1259 (SE +/- 2.40, N = 3)
SNC4: 1266 (SE +/- 4.33, N = 3)

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 1064 (SE +/- 4.18, N = 3)
SNC2: 1074 (SE +/- 2.73, N = 3)
SNC4: 1073 (SE +/- 1.86, N = 3)

OSPRay Studio

Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 1076 (SE +/- 3.79, N = 3)
SNC2: 1080 (SE +/- 3.61, N = 3)
SNC4: 1079 (SE +/- 3.71, N = 3)

OSPRay Studio

Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 17193 (SE +/- 94.37, N = 3)
SNC2: 17113 (SE +/- 54.03, N = 3)
SNC4: 17262 (SE +/- 70.54, N = 3)

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 16939 (SE +/- 35.36, N = 3)
SNC2: 16999 (SE +/- 20.23, N = 3)
SNC4: 17014 (SE +/- 2.67, N = 3)

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 965.67 (SE +/- 0.53, N = 3; MIN: 767.19 / MAX: 1026.9)
SNC2: 468.35 (SE +/- 0.78, N = 3; MIN: 396.69 / MAX: 533.02)
SNC4: 467.39 (SE +/- 1.50, N = 3; MIN: 391.77 / MAX: 504.29)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 49.41 (SE +/- 0.02, N = 3)
SNC2: 51.06 (SE +/- 0.09, N = 3)
SNC4: 51.19 (SE +/- 0.15, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 493.57 (SE +/- 0.76, N = 3; MIN: 246.03 / MAX: 522.95)
SNC2: 245.02 (SE +/- 0.27, N = 3; MIN: 201.96 / MAX: 285.01)
SNC4: 244.80 (SE +/- 0.34, N = 3; MIN: 212 / MAX: 263.67)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 96.95 (SE +/- 0.15, N = 3)
SNC2: 97.76 (SE +/- 0.12, N = 3)
SNC4: 97.85 (SE +/- 0.15, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

LuxCoreRender

Scene: Danish Mood - Acceleration: CPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
Default - Disabled: 10.65 (SE +/- 0.08, N = 3; MIN: 4.82 / MAX: 12.11)
SNC2: 10.85 (SE +/- 0.12, N = 3; MIN: 5.02 / MAX: 12.4)
SNC4: 10.97 (SE +/- 0.11, N = 3; MIN: 5.05 / MAX: 12.72)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 18.99 (SE +/- 0.25, N = 3; MIN: 17.88 / MAX: 19.86)
SNC2: 18.44 (SE +/- 0.21, N = 3; MIN: 9.73 / MAX: 19.35)
SNC4: 18.62 (SE +/- 0.20, N = 3; MIN: 10.5 / MAX: 19.64)

Timed CPython Compilation

Build Configuration: Released Build, PGO + LTO Optimized

Timed CPython Compilation 3.10.6 (Seconds, Fewer Is Better)
Default - Disabled: 188.35
SNC2: 188.07
SNC4: 185.51

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: CPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
Default - Disabled: 12.39 (SE +/- 0.11, N = 3; MIN: 5.87 / MAX: 14.11)
SNC2: 11.94 (SE +/- 0.09, N = 3; MIN: 5.67 / MAX: 13.6)
SNC4: 11.89 (SE +/- 0.15, N = 3; MIN: 5.42 / MAX: 13.74)

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 141.54 (SE +/- 0.37, N = 3; MIN: 50.69 / MAX: 212.29)
SNC2: 67.17 (SE +/- 0.43, N = 3; MIN: 38.08 / MAX: 93.26)
SNC4: 65.40 (SE +/- 0.17, N = 3; MIN: 44.03 / MAX: 91.21)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 338.74 (SE +/- 0.89, N = 3)
SNC2: 357.03 (SE +/- 2.29, N = 3)
SNC4: 366.59 (SE +/- 0.96, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 141.78 (SE +/- 0.52, N = 3; MIN: 54.39 / MAX: 210.64)
SNC2: 67.22 (SE +/- 0.29, N = 3; MIN: 37.86 / MAX: 90.7)
SNC4: 65.32 (SE +/- 0.13, N = 3; MIN: 36.8 / MAX: 87.75)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 338.09 (SE +/- 1.27, N = 3)
SNC2: 356.72 (SE +/- 1.55, N = 3)
SNC4: 367.12 (SE +/- 0.76, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 93.23 (SE +/- 0.07, N = 3; MIN: 42.51 / MAX: 145.87)
SNC2: 49.80 (SE +/- 0.22, N = 3; MIN: 38.67 / MAX: 117.73)
SNC4: 50.56 (SE +/- 0.15, N = 3; MIN: 38.52 / MAX: 103.27)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 514.22 (SE +/- 0.33, N = 3)
SNC2: 481.41 (SE +/- 2.07, N = 3)
SNC4: 474.11 (SE +/- 1.37, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

QuantLib

Configuration: Multi-Threaded

QuantLib 1.32 (MFLOPS, More Is Better)
Default - Disabled: 310771.1 (SE +/- 716.12, N = 3)
SNC2: 313010.5 (SE +/- 1378.05, N = 3)
SNC4: 317145.3 (SE +/- 3981.12, N = 3)
(CXX) g++ options: -O3 -march=native -fPIE -pie

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 8.49 (SE +/- 0.02, N = 3; MIN: 5.66 / MAX: 25.99)
SNC2: 5.06 (SE +/- 0.01, N = 3; MIN: 4.47 / MAX: 12.75)
SNC4: 5.07 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 13.85)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 5643.89 (SE +/- 12.20, N = 3)
SNC2: 4737.07 (SE +/- 6.31, N = 3)
SNC4: 4721.81 (SE +/- 10.32, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 5.38 (SE +/- 0.00, N = 3; MIN: 3.31 / MAX: 23.31)
SNC2: 5.52 (SE +/- 0.01, N = 3; MIN: 4.91 / MAX: 14.41)
SNC4: 5.51 (SE +/- 0.00, N = 3; MIN: 4.84 / MAX: 15.48)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 17809.43 (SE +/- 14.08, N = 3)
SNC2: 17366.29 (SE +/- 14.02, N = 3)
SNC4: 17402.76 (SE +/- 11.11, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 25.26 (SE +/- 0.05, N = 3; MIN: 12.5 / MAX: 49.19)
SNC2: 12.52 (SE +/- 0.05, N = 3; MIN: 10.66 / MAX: 27.97)
SNC4: 12.54 (SE +/- 0.05, N = 3; MIN: 10.7 / MAX: 28.81)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 1898.14 (SE +/- 4.08, N = 3)
SNC2: 1914.89 (SE +/- 7.84, N = 3)
SNC4: 1911.68 (SE +/- 7.31, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 36.57 (SE +/- 0.36, N = 3; MIN: 15.82 / MAX: 76.01)
SNC2: 14.51 (SE +/- 0.04, N = 3; MIN: 12.17 / MAX: 35.53)
SNC4: 14.56 (SE +/- 0.01, N = 3; MIN: 11.83 / MAX: 32.54)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 1310.86 (SE +/- 12.81, N = 3)
SNC2: 1652.91 (SE +/- 4.22, N = 3)
SNC4: 1647.09 (SE +/- 0.95, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

LuxCoreRender

Scene: DLSC - Acceleration: CPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
Default - Disabled: 15.23 (SE +/- 0.02, N = 3; MIN: 14.85 / MAX: 18.76)
SNC2: 15.19 (SE +/- 0.07, N = 3; MIN: 14.66 / MAX: 18.88)
SNC4: 15.00 (SE +/- 0.02, N = 3; MIN: 14.61 / MAX: 18.76)

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 45.25 (SE +/- 0.07, N = 3; MIN: 34.3 / MAX: 61.3)
SNC2: 45.22 (SE +/- 0.11, N = 3; MIN: 37.77 / MAX: 56.52)
SNC4: 45.08 (SE +/- 0.16, N = 3; MIN: 36.6 / MAX: 52.28)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 2120.23 (SE +/- 3.37, N = 3)
SNC2: 2121.18 (SE +/- 5.01, N = 3)
SNC4: 2127.42 (SE +/- 7.69, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 0.62 (SE +/- 0.00, N = 3; MIN: 0.21 / MAX: 39.81)
SNC2: 0.36 (SE +/- 0.00, N = 3; MIN: 0.27 / MAX: 31.55)
SNC4: 0.36 (SE +/- 0.00, N = 3; MIN: 0.27 / MAX: 41.09)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 113350.17 (SE +/- 251.60, N = 3)
SNC2: 166444.72 (SE +/- 805.14, N = 3)
SNC4: 164363.19 (SE +/- 183.36, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 37.01 (SE +/- 0.27, N = 3; MIN: 20.68 / MAX: 66.67)
SNC2: 39.36 (SE +/- 0.40, N = 3; MIN: 32.78 / MAX: 55.31)
SNC4: 41.08 (SE +/- 0.17, N = 3; MIN: 31.72 / MAX: 55.31)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 2592.59 (SE +/- 19.25, N = 3)
SNC2: 2435.28 (SE +/- 24.21, N = 3)
SNC4: 2333.46 (SE +/- 9.95, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 0.86 (SE +/- 0.00, N = 3; MIN: 0.28 / MAX: 17.81)
SNC2: 0.61 (SE +/- 0.00, N = 3; MIN: 0.44 / MAX: 16.14)
SNC4: 0.61 (SE +/- 0.00, N = 3; MIN: 0.46 / MAX: 16.98)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 86958.12 (SE +/- 195.22, N = 3)
SNC2: 134566.00 (SE +/- 365.77, N = 3)
SNC4: 134176.02 (SE +/- 140.85, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 19.33 (SE +/- 0.03, N = 3; MIN: 9.33 / MAX: 85.74)
SNC2: 19.08 (SE +/- 0.04, N = 3; MIN: 15.77 / MAX: 56.75)
SNC4: 19.06 (SE +/- 0.03, N = 3; MIN: 17.12 / MAX: 40.94)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 4962.13 (SE +/- 6.69, N = 3)
SNC2: 5023.42 (SE +/- 9.79, N = 3)
SNC4: 5026.69 (SE +/- 6.67, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 8.07 (SE +/- 0.02, N = 3; MIN: 4.25 / MAX: 26.67)
SNC2: 4.14 (SE +/- 0.01, N = 3; MIN: 3.73 / MAX: 12.01)
SNC4: 4.13 (SE +/- 0.01, N = 3; MIN: 3.67 / MAX: 12.51)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 5936.37 (SE +/- 10.15, N = 3)
SNC2: 5783.01 (SE +/- 14.13, N = 3)
SNC4: 5797.28 (SE +/- 15.69, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 9.72 (SE +/- 0.03, N = 3; MIN: 4.95 / MAX: 29.09)
SNC2: 9.63 (SE +/- 0.02, N = 3; MIN: 8.28 / MAX: 16.71)
SNC4: 9.61 (SE +/- 0.03, N = 3; MIN: 8.18 / MAX: 19.44)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 9866.45 (SE +/- 30.53, N = 3)
SNC2: 9948.30 (SE +/- 17.38, N = 3)
SNC4: 9973.18 (SE +/- 26.50, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 17.55 (SE +/- 0.06, N = 3; MIN: 5.74 / MAX: 43.74)
SNC2: 6.18 (SE +/- 0.01, N = 3; MIN: 5.27 / MAX: 16.2)
SNC4: 6.20 (SE +/- 0.01, N = 3; MIN: 5.02 / MAX: 17.36)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 2730.60 (SE +/- 9.07, N = 3)
SNC2: 3875.39 (SE +/- 8.11, N = 3)
SNC4: 3863.88 (SE +/- 8.94, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2023.2.dev (ms, Fewer Is Better)
Default - Disabled: 3.81 (SE +/- 0.02, N = 3; MIN: 2.1 / MAX: 21.93)
SNC2: 2.09 (SE +/- 0.01, N = 3; MIN: 1.85 / MAX: 7.8)
SNC4: 2.09 (SE +/- 0.01, N = 3; MIN: 1.87 / MAX: 8.99)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2023.2.dev (FPS, More Is Better)
Default - Disabled: 12547.91 (SE +/- 73.24, N = 3)
SNC2: 11414.89 (SE +/- 56.50, N = 3)
SNC4: 11421.25 (SE +/- 61.71, N = 3)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenSSL

Algorithm: RSA4096

OpenSSL 3.1 (verify/s, More Is Better)
Default - Disabled: 1533067.1 (SE +/- 3444.37, N = 3)
SNC2: 1529451.7 (SE +/- 486.05, N = 3)
SNC4: 1538165.3 (SE +/- 3270.23, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.1 (sign/s, More Is Better)
Default - Disabled: 49897.3 (SE +/- 49.04, N = 3)
SNC2: 49862.1 (SE +/- 67.46, N = 3)
SNC4: 49924.9 (SE +/- 90.07, N = 3)
(CC) gcc options: -pthread -m64 -O3 -ldl

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.12 (images/sec, More Is Better)
Default - Disabled: 70.35 (SE +/- 0.29, N = 3)
SNC2: 66.13 (SE +/- 0.19, N = 3)
SNC4: 66.88 (SE +/- 0.74, N = 3)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 39.21 (SE +/- 0.10, N = 3; MIN: 36.81 / MAX: 40.31)
SNC2: 40.41 (SE +/- 0.40, N = 3; MIN: 34.93 / MAX: 42.02)
SNC4: 39.29 (SE +/- 0.45, N = 3; MIN: 25.78 / MAX: 41.34)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 39.06 (SE +/- 0.05, N = 3; MIN: 37.1 / MAX: 40.22)
SNC2: 40.27 (SE +/- 0.20, N = 3; MIN: 23.49 / MAX: 42.09)
SNC4: 39.07 (SE +/- 0.12, N = 3; MIN: 19.69 / MAX: 41.1)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1 (batches/sec, More Is Better)
Default - Disabled: 38.68 (SE +/- 0.12, N = 3; MIN: 37.08 / MAX: 39.76)
SNC2: 40.09 (SE +/- 0.17, N = 3; MIN: 22.35 / MAX: 41.64)
SNC4: 39.80 (SE +/- 0.17, N = 3; MIN: 26.68 / MAX: 41.28)

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 43999 (SE +/- 70.44, N = 3)
SNC2: 44300 (SE +/- 80.70, N = 3)
SNC4: 44535 (SE +/- 146.66, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.0 (Seconds, Fewer Is Better)
Default - Disabled: 46.77 (SE +/- 0.20, N = 3)
SNC2: 46.97 (SE +/- 0.23, N = 3)
SNC4: 46.88 (SE +/- 0.29, N = 3)

OSPRay Studio

Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 38264 (SE +/- 105.83, N = 3)
SNC2: 38015 (SE +/- 192.95, N = 3)
SNC4: 38523 (SE +/- 65.34, N = 3)

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU

OSPRay Studio 0.13 (ms, Fewer Is Better)
Default - Disabled: 37890 (SE +/- 51.31, N = 3)
SNC2: 38009 (SE +/- 106.88, N = 3)
SNC4: 38284 (SE +/- 154.47, N = 3)

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

VVenC 1.9 (Frames Per Second, More Is Better)
Default - Disabled: 15.25 (SE +/- 0.02, N = 3)
SNC2: 14.31 (SE +/- 0.12, N = 3)
SNC4: 14.01 (SE +/- 0.09, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

CloverLeaf

Input: clover_bm64_short

CloverLeaf 1.3 (Seconds, Fewer Is Better)
Default - Disabled: 39.32 (SE +/- 0.01, N = 3)
SNC2: 41.21 (SE +/- 0.26, N = 3)
SNC4: 42.00 (SE +/- 0.08, N = 3)
(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

CloverLeaf

Input: clover_bm

CloverLeaf 1.3 (Seconds, Fewer Is Better)
Default - Disabled: 10.96 (SE +/- 0.07, N = 14)
SNC2: 11.82 (SE +/- 0.09, N = 15)
SNC4: 10.95 (SE +/- 0.07, N = 3)
(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.12 (images/sec, More Is Better)
Default - Disabled: 51.77 (SE +/- 0.23, N = 3)
SNC2: 49.51 (SE +/- 0.59, N = 3)
SNC4: 49.27 (SE +/- 0.63, N = 3)

Radiance Benchmark

Test: SMP Parallel

Radiance Benchmark 5.0 (Seconds, Fewer Is Better)
Default - Disabled: 119.19
SNC2: 114.28
SNC4: 117.44

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.0 (Seconds, Fewer Is Better)
Default - Disabled: 38.02 (SE +/- 0.09, N = 3)
SNC2: 38.33 (SE +/- 0.03, N = 3)
SNC4: 38.24 (SE +/- 0.07, N = 3)

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.1 (Seconds, Fewer Is Better)
Default - Disabled: 31.45 (SE +/- 0.34, N = 4)
SNC2: 24.58 (SE +/- 0.26, N = 5)
SNC4: 24.09 (SE +/- 0.29, N = 4)

LULESH

LULESH 2.0.3 (z/s, More Is Better)
Default - Disabled: 23294.84 (SE +/- 398.51, N = 12)
SNC2: 33106.24 (SE +/- 116.87, N = 3)
SNC4: 38146.30 (SE +/- 142.33, N = 3)
(CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 22.01 (MIPS, More Is Better)
Default - Disabled: 655203 (SE +/- 364.51, N = 3)
SNC2: 640928 (SE +/- 11076.26, N = 3)
SNC4: 649120 (SE +/- 5454.96, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Compression Rating

7-Zip Compression 22.01 (MIPS, More Is Better)
Default - Disabled: 546936 (SE +/- 410.41, N = 3)
SNC2: 535713 (SE +/- 2142.48, N = 3)
SNC4: 536820 (SE +/- 2055.13, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC

Algebraic Multi-Grid Benchmark

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
Default - Disabled: 1662894000 (SE +/- 1222576.38, N = 3)
SNC2: 1732149333 (SE +/- 5512706.85, N = 3)
SNC4: 1773758333 (SE +/- 14033805.07, N = 3)
(CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

SPECFEM3D

Model: Homogeneous Halfspace

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
Default - Disabled: 9.942989762 (SE +/- 0.062415502, N = 15)
SNC2: 9.742806803 (SE +/- 0.112412006, N = 4)
SNC4: 9.668759733 (SE +/- 0.063880399, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Liquid-DSP

Threads: 192 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 (samples/s, More Is Better)
Default - Disabled: 1514200000 (SE +/- 4629254.80, N = 3)
SNC2: 1499133333 (SE +/- 5353295.97, N = 3)
SNC4: 1493000000 (SE +/- 5108163.40, N = 3)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Rodinia

Test: OpenMP Leukocyte

Rodinia 3.1 (Seconds, Fewer Is Better)
Default - Disabled: 29.39 (SE +/- 0.23, N = 3)
SNC2: 30.89 (SE +/- 0.14, N = 3)
SNC4: 31.12 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 (samples/s, More Is Better)
Default - Disabled: 1314200000 (SE +/- 6222807.51, N = 3)
SNC2: 1302833333 (SE +/- 6948700.92, N = 3)
SNC4: 1300233333 (SE +/- 7198688.15, N = 3)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

John The Ripper

Test: WPA PSK

John The Ripper 2023.03.14 (Real C/S, More Is Better)
Default - Disabled: 614263 (SE +/- 2146.03, N = 3)
SNC2: 611669 (SE +/- 4996.06, N = 3)
SNC4: 596180 (SE +/- 4063.87, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 938493333 (SE +/- 8999104.28, N = 3)
SNC2: 935170000 (SE +/- 8992613.64, N = 3)
SNC4: 927276667 (SE +/- 6276308.19, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 522560000 (SE +/- 2767116.43, N = 3)
SNC2: 523603333 (SE +/- 3417076.40, N = 3)
SNC4: 522996667 (SE +/- 3819181.12, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 192 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 5407200000 (SE +/- 18999298.23, N = 3)
SNC2: 5393633333 (SE +/- 15429013.07, N = 3)
SNC4: 5416600000 (SE +/- 14304195.19, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 192 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 5526733333 (SE +/- 19784955.00, N = 3)
SNC2: 5519933333 (SE +/- 18095333.96, N = 3)
SNC4: 5535500000 (SE +/- 29526993.30, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 4495533333 (SE +/- 17623122.44, N = 3)
SNC2: 4493966667 (SE +/- 10235938.86, N = 3)
SNC4: 4518133333 (SE +/- 22778084.01, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 4228133333 (SE +/- 18076719.22, N = 3)
SNC2: 4233566667 (SE +/- 26426018.32, N = 3)
SNC4: 4243600000 (SE +/- 26314254.69, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 3020900000 (SE +/- 20219380.14, N = 3)
SNC2: 3016100000 (SE +/- 14068522.78, N = 3)
SNC4: 3023166667 (SE +/- 1844210.16, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 2646100000 (SE +/- 26479677.74, N = 3)
SNC2: 2632233333 (SE +/- 22995168.57, N = 3)
SNC4: 2643433333 (SE +/- 26768908.17, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 1435333333 (SE +/- 1937638.88, N = 3)
SNC2: 1436433333 (SE +/- 5024716.69, N = 3)
SNC4: 1440933333 (SE +/- 3769320.60, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 - samples/s, More Is Better
Default - Disabled: 1730566667 (SE +/- 3985947.54, N = 3)
SNC2: 1744433333 (SE +/- 3811532.21, N = 3)
SNC4: 1721266667 (SE +/- 6406333.67, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

VVenC

Video Input: Bosphorus 1080p - Video Preset: Fast

VVenC 1.9 - Frames Per Second, More Is Better
Default - Disabled: 24.30 (SE +/- 0.17, N = 3)
SNC2: 23.71 (SE +/- 0.29, N = 3)
SNC4: 23.45 (SE +/- 0.24, N = 3)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Rodinia

Test: OpenMP LavaMD

Rodinia 3.1 - Seconds, Fewer Is Better
Default - Disabled: 26.95 (SE +/- 0.07, N = 3)
SNC2: 26.91 (SE +/- 0.09, N = 3)
SNC4: 26.97 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.1 - batches/sec, More Is Better
Default - Disabled: 47.70 (SE +/- 0.07, N = 3; MIN: 44.82 / MAX: 49.01)
SNC2: 48.83 (SE +/- 0.25, N = 3; MIN: 28.77 / MAX: 50.91)
SNC4: 48.36 (SE +/- 0.25, N = 3; MIN: 25.09 / MAX: 50.72)

SPECFEM3D

Model: Water-layered Halfspace

SPECFEM3D 4.0 - Seconds, Fewer Is Better
Default - Disabled: 19.27 (SE +/- 0.08, N = 3)
SNC2: 19.22 (SE +/- 0.19, N = 3)
SNC4: 18.82 (SE +/- 0.06, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Layered Halfspace

SPECFEM3D 4.0 - Seconds, Fewer Is Better
Default - Disabled: 18.74 (SE +/- 0.16, N = 3)
SNC2: 18.36 (SE +/- 0.24, N = 3)
SNC4: 18.04 (SE +/- 0.14, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: CPU

LuxCoreRender 2.6 - M samples/sec, More Is Better
Default - Disabled: 32.75 (SE +/- 1.27, N = 15; MIN: 17.73 / MAX: 35.6)
SNC2: 31.82 (SE +/- 1.17, N = 15; MIN: 18.52 / MAX: 35.42)
SNC4: 34.72 (SE +/- 0.24, N = 3; MIN: 34.26 / MAX: 35.1)

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.0 - Seconds, Fewer Is Better
Default - Disabled: 19.47 (SE +/- 0.09, N = 3)
SNC2: 19.76 (SE +/- 0.20, N = 3)
SNC4: 19.71 (SE +/- 0.05, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Slow

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 29.92 (SE +/- 0.09, N = 3)
SNC2: 29.47 (SE +/- 0.09, N = 3)
SNC4: 29.30 (SE +/- 0.13, N = 3)

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14 - days/ns, Fewer Is Better
Default - Disabled: 0.25803 (SE +/- 0.00087, N = 3)
SNC2: 0.25612 (SE +/- 0.00284, N = 3)
SNC4: 0.25502 (SE +/- 0.00197, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Medium

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 33.23 (SE +/- 0.12, N = 3)
SNC2: 33.12 (SE +/- 0.17, N = 3)
SNC4: 32.75 (SE +/- 0.08, N = 3)

VVenC

Video Input: Bosphorus 1080p - Video Preset: Faster

VVenC 1.9 - Frames Per Second, More Is Better
Default - Disabled: 41.40 (SE +/- 0.17, N = 3)
SNC2: 40.22 (SE +/- 0.32, N = 3)
SNC4: 40.20 (SE +/- 0.45, N = 3)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

ASKAP

Test: tConvolve OpenMP - Degridding

ASKAP 1.0 - Million Grid Points Per Second, More Is Better
Default - Disabled: 20481.20
SNC2: 14122.50
SNC4: 7607.31
SE values as exported (only two are present): SE +/- 100.74, N = 15; SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve OpenMP - Gridding

ASKAP 1.0 - Million Grid Points Per Second, More Is Better
Default - Disabled: 19018.30 (SE +/- 0.00, N = 3)
SNC2: 12083.70 (SE +/- 122.89, N = 15)
SNC4: 6602.28 (SE +/- 54.12, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.0 - Seconds, Fewer Is Better
Default - Disabled: 15.25 (SE +/- 0.04, N = 3)
SNC2: 15.32 (SE +/- 0.06, N = 3)
SNC4: 15.31 (SE +/- 0.04, N = 3)

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 214445.97 (SE +/- 280.16, N = 3)
SNC2: 215684.06 (SE +/- 322.09, N = 3)
SNC4: 215764.82 (SE +/- 112.43, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon Obj

Embree 4.3 - Frames Per Second, More Is Better
Default - Disabled: 112.83 (SE +/- 0.59, N = 3; MIN: 110.06 / MAX: 116.96)
SNC2: 111.36 (SE +/- 0.58, N = 3; MIN: 108.45 / MAX: 114.81)
SNC4: 104.80 (SE +/- 0.54, N = 3; MIN: 100.15 / MAX: 112.2)

ASKAP

Test: Hogbom Clean OpenMP

ASKAP 1.0 - Iterations Per Second, More Is Better
Default - Disabled: 1127.85 (SE +/- 4.25, N = 3)
SNC2: 564.98 (SE +/- 1.84, N = 3)
SNC4: 325.39 (SE +/- 1.54, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
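
The ASKAP results show the largest spread between the three modes in this export. As a rough illustration of how the raw figures can be put on a common scale, the sketch below expresses each mode's Hogbom Clean result as a fraction of the default run. The helper name `relative_to_baseline` is hypothetical and is not part of the Phoronix Test Suite; the three numbers are copied from the result above.

```python
# Values copied from the ASKAP 1.0 "Hogbom Clean OpenMP" result above
# (Iterations Per Second, more is better).
results = {"Default - Disabled": 1127.85, "SNC2": 564.98, "SNC4": 325.39}


def relative_to_baseline(values, baseline="Default - Disabled"):
    """Hypothetical helper: each result as a fraction of the baseline result."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


ratios = relative_to_baseline(results)
for name, ratio in ratios.items():
    # For a more-is-better metric, a ratio below 100% means slower than default.
    print(f"{name}: {ratio:.1%} of default")
```

For fewer-is-better results (seconds, days/ns) the same ratio reads the other way around: a value below 100% means the mode was faster than the default.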

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 - Seconds, Fewer Is Better
Default - Disabled: 10.66 (SE +/- 0.03, N = 3)
SNC2: 10.49 (SE +/- 0.05, N = 3)
SNC4: 10.36 (SE +/- 0.08, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

libxsmm

M N K: 64

libxsmm 2-1.17-3645 - GFLOPS/s, More Is Better
Default - Disabled: 1055.4 (SE +/- 0.53, N = 3)
SNC2: 1052.3 (SE +/- 2.33, N = 3)
SNC4: 1059.1 (SE +/- 1.09, N = 3)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

SPECFEM3D

Model: Mount St. Helens

SPECFEM3D 4.0 - Seconds, Fewer Is Better
Default - Disabled: 7.724890032 (SE +/- 0.077041220, N = 3)
SNC2: 7.756233737 (SE +/- 0.022697239, N = 3)
SNC4: 7.545176178 (SE +/- 0.073100963, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 4279.98 (SE +/- 21.28, N = 3)
SNC2: 4282.36 (SE +/- 8.86, N = 3)
SNC4: 4155.68 (SE +/- 45.26, N = 4)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

libxsmm

M N K: 32

libxsmm 2-1.17-3645 - GFLOPS/s, More Is Better
Default - Disabled: 555.6 (SE +/- 0.58, N = 3)
SNC2: 553.3 (SE +/- 2.72, N = 3)
SNC4: 559.8 (SE +/- 1.96, N = 3)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

uvg266

Video Input: Bosphorus 4K - Video Preset: Very Fast

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 66.32 (SE +/- 0.38, N = 3)
SNC2: 62.99 (SE +/- 0.60, N = 3)
SNC4: 61.26 (SE +/- 0.22, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Super Fast

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 67.47 (SE +/- 0.24, N = 3)
SNC2: 64.65 (SE +/- 0.12, N = 3)
SNC4: 62.25 (SE +/- 0.42, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 69.02 (SE +/- 0.15, N = 3)
SNC2: 65.34 (SE +/- 0.36, N = 3)
SNC4: 63.23 (SE +/- 0.25, N = 3)

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 255135.24 (SE +/- 1442.62, N = 3)
SNC2: 256378.22 (SE +/- 427.36, N = 3)
SNC4: 259883.62 (SE +/- 1304.12, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

NAS Parallel Benchmarks

Test / Class: EP.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 8869.84 (SE +/- 60.80, N = 3)
SNC2: 9712.81 (SE +/- 257.16, N = 15)
SNC4: 10411.60 (SE +/- 304.30, N = 12)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 4.3 - Frames Per Second, More Is Better
Default - Disabled: 108.08 (SE +/- 0.46, N = 3; MIN: 105.13 / MAX: 117.76)
SNC2: 106.26 (SE +/- 0.53, N = 3; MIN: 102.99 / MAX: 115.22)
SNC4: 106.43 (SE +/- 0.40, N = 3; MIN: 103.16 / MAX: 113.77)

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

Embree 4.3 - Frames Per Second, More Is Better
Default - Disabled: 129.17 (SE +/- 0.63, N = 3; MIN: 126.31 / MAX: 136)
SNC2: 128.01 (SE +/- 0.56, N = 3; MIN: 125.3 / MAX: 131.84)
SNC4: 122.59 (SE +/- 1.43, N = 4; MIN: 115.06 / MAX: 129.69)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Slow

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 89.19 (SE +/- 0.27, N = 3)
SNC2: 88.68 (SE +/- 0.19, N = 3)
SNC4: 88.18 (SE +/- 0.40, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Medium

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 98.33 (SE +/- 0.23, N = 3)
SNC2: 98.22 (SE +/- 0.18, N = 3)
SNC4: 97.60 (SE +/- 0.34, N = 3)

NAS Parallel Benchmarks

Test / Class: FT.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 100728.49 (SE +/- 245.52, N = 3)
SNC2: 97435.24 (SE +/- 925.13, N = 3)
SNC4: 94921.27 (SE +/- 107.70, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

Rodinia

Test: OpenMP CFD Solver

Rodinia 3.1 - Seconds, Fewer Is Better
Default - Disabled: 5.577 (SE +/- 0.036, N = 3)
SNC2: 5.581 (SE +/- 0.002, N = 3)
SNC4: 5.585 (SE +/- 0.013, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 - GFLOP/s, More Is Better
Default - Disabled: 43.88 (SE +/- 0.18, N = 3)
SNC2: 43.77 (SE +/- 0.36, N = 3)
SNC4: 43.63 (SE +/- 0.42, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

Rodinia

Test: OpenMP Streamcluster

Rodinia 3.1 - Seconds, Fewer Is Better
Default - Disabled: 4.703 (SE +/- 0.007, N = 3)
SNC2: 4.884 (SE +/- 0.019, N = 3)
SNC4: 4.933 (SE +/- 0.027, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 - Seconds, Fewer Is Better
Default - Disabled: 2.61738705 (SE +/- 0.03355279, N = 3)
SNC2: 2.49659332 (SE +/- 0.01815766, N = 3)
SNC4: 2.41614302 (SE +/- 0.00587555, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 52079.13 (SE +/- 476.77, N = 3)
SNC2: 54560.03 (SE +/- 324.22, N = 3)
SNC4: 56924.56 (SE +/- 191.96, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

NAS Parallel Benchmarks

Test / Class: SP.B

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 145525.94 (SE +/- 676.26, N = 3)
SNC2: 151244.49 (SE +/- 674.53, N = 3)
SNC4: 151839.30 (SE +/- 1739.78, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

NAS Parallel Benchmarks

Test / Class: MG.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
Default - Disabled: 95501.50 (SE +/- 1045.55, N = 4)
SNC2: 97068.01 (SE +/- 59.39, N = 3)
SNC4: 98739.50 (SE +/- 517.95, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

uvg266

Video Input: Bosphorus 1080p - Video Preset: Very Fast

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 208.36 (SE +/- 1.32, N = 3)
SNC2: 203.48 (SE +/- 0.63, N = 3)
SNC4: 207.62 (SE +/- 1.28, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Ultra Fast

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 207.97 (SE +/- 0.58, N = 3)
SNC2: 204.49 (SE +/- 0.83, N = 3)
SNC4: 208.97 (SE +/- 2.40, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Super Fast

uvg266 0.4.1 - Frames Per Second, More Is Better
Default - Disabled: 211.36 (SE +/- 0.37, N = 3)
SNC2: 206.23 (SE +/- 1.37, N = 3)
SNC4: 206.92 (SE +/- 2.03, N = 3)


Phoronix Test Suite v10.8.5