AMD SME Benchmark Genoa

4th Gen AMD EPYC "Genoa" Secure Memory Encryption (SME) benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2212212-NE-AMDSMEBEN19.

AMD SME Benchmark Genoa: system details
Configurations: No SME, AMD SME Enabled

Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1002E BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 800GB INTEL SSDPF21Q800GB
Graphics: ASPEED
Monitor: VGA HDMI
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10
Kernel: 6.1.0-phx (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 12.2.0 + Clang 15.0.2-1
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa10110d

Java Details
- OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu2)

Python Details
- Python 3.10.7

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

AMD SME Benchmark Genoa: result overview table (No SME vs. AMD SME Enabled). The individual results are listed per test below.

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
No SME: 3052.8 (SE +/- 6.39, N = 3)
AMD SME Enabled: 3061.3 (SE +/- 8.14, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic
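
To read any of these results as a relative cost of enabling SME, the two means can be turned into a signed percentage. The following is a minimal sketch of that arithmetic, not something taken from the Phoronix Test Suite; the helper names `percent_change` and `sme_impact` are hypothetical, and the input values are the QuantLib means reported above.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Return the change of candidate relative to baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def sme_impact(no_sme: float, sme: float, more_is_better: bool) -> float:
    """Signed performance impact of enabling SME, in percent.

    Positive means the SME run was faster, negative means it was slower.
    For "Fewer Is Better" metrics (seconds, ms) the sign is flipped so the
    interpretation stays the same.
    """
    change = percent_change(no_sme, sme)
    return change if more_is_better else -change


# QuantLib 1.21 (MFLOPS, More Is Better): 3052.8 without SME, 3061.3 with SME.
quantlib = sme_impact(3052.8, 3061.3, more_is_better=True)
print(f"QuantLib: {quantlib:+.2f}%")
```

The same helper applies to the "Fewer Is Better" results by passing `more_is_better=False`.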

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
No SME: 88.39 (SE +/- 0.10, N = 3)
AMD SME Enabled: 87.15 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
No SME: 496467.98 (SE +/- 529.89, N = 3)
AMD SME Enabled: 494917.44 (SE +/- 3984.99, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.4

NAS Parallel Benchmarks

Test / Class: EP.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
No SME: 16457.94 (SE +/- 54.14, N = 3)
AMD SME Enabled: 16462.35 (SE +/- 73.01, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.4

NAS Parallel Benchmarks

Test / Class: FT.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
No SME: 223096.07 (SE +/- 2651.33, N = 4)
AMD SME Enabled: 220214.75 (SE +/- 1868.66, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.4

NAS Parallel Benchmarks

Test / Class: SP.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
No SME: 255564.19 (SE +/- 3645.86, N = 3)
AMD SME Enabled: 253299.33 (SE +/- 2731.53, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.4

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (GFInst/s, More Is Better)
No SME: 7281.28 (SE +/- 16.32, N = 3)
AMD SME Enabled: 7290.64 (SE +/- 8.37, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
No SME: 291.25 (SE +/- 0.65, N = 3)
AMD SME Enabled: 291.63 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (GFInst/s, More Is Better)
No SME: 8633.54 (SE +/- 80.07, N = 3)
AMD SME Enabled: 8588.75 (SE +/- 99.74, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
No SME: 345.34 (SE +/- 3.20, N = 3)
AMD SME Enabled: 343.55 (SE +/- 3.99, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

Rodinia

Test: OpenMP LavaMD

Rodinia 3.1 (Seconds, Fewer Is Better)
No SME: 16.51 (SE +/- 0.13, N = 3)
AMD SME Enabled: 16.67 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenMP CFD Solver

Rodinia 3.1 (Seconds, Fewer Is Better)
No SME: 5.938 (SE +/- 0.012, N = 3)
AMD SME Enabled: 6.043 (SE +/- 0.030, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14 (days/ns, Fewer Is Better)
No SME: 0.12831 (SE +/- 0.00031, N = 3)
AMD SME Enabled: 0.12991 (SE +/- 0.00010, N = 3)

NWChem

Input: C240 Buckyball

NWChem 7.0.2 (Seconds, Fewer Is Better)
No SME: 1524.4
AMD SME Enabled: 1543.1
1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
No SME: 4.37420527 (SE +/- 0.01135008, N = 3)
AMD SME Enabled: 4.42424568 (SE +/- 0.04122391, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Mesh Time

OpenFOAM 10 (Seconds, Fewer Is Better)
No SME: 25.07
AMD SME Enabled: 27.10
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Execution Time

OpenFOAM 10 (Seconds, Fewer Is Better)
No SME: 22.08
AMD SME Enabled: 22.13
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenRadioss

Model: Bumper Beam

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
No SME: 79.85 (SE +/- 0.73, N = 3)
AMD SME Enabled: 79.97 (SE +/- 0.15, N = 3)

OpenRadioss

Model: Cell Phone Drop Test

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
No SME: 18.45 (SE +/- 0.02, N = 3)
AMD SME Enabled: 18.32 (SE +/- 0.13, N = 3)

OpenRadioss

Model: INIVOL and Fluid Structure Interaction Drop Container

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
No SME: 80.88 (SE +/- 0.09, N = 3)
AMD SME Enabled: 80.90 (SE +/- 0.15, N = 3)

RELION

Test: Basic - Device: CPU

RELION 3.1.1 (Seconds, Fewer Is Better)
No SME: 128.66 (SE +/- 1.40, N = 5)
AMD SME Enabled: 130.43 (SE +/- 1.42, N = 5)
1. (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -lmpi_cxx -lmpi

LULESH

LULESH 2.0.3 (z/s, More Is Better)
No SME: 59069.41 (SE +/- 197.53, N = 3)
AMD SME Enabled: 57686.09 (SE +/- 360.17, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

Xmrig

Variant: Monero - Hash Count: 1M

Xmrig 6.18.1 (H/s, More Is Better)
No SME: 105141.7 (SE +/- 111.86, N = 3)
AMD SME Enabled: 101932.1 (SE +/- 540.08, N = 3)
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Xmrig

Variant: Wownero - Hash Count: 1M

Xmrig 6.18.1 (H/s, More Is Better)
No SME: 126508.3 (SE +/- 211.86, N = 3)
AMD SME Enabled: 123484.1 (SE +/- 341.44, N = 3)
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

DaCapo Benchmark

Java Test: H2

DaCapo Benchmark 9.12-MR1 (msec, Fewer Is Better)
No SME: 4807 (SE +/- 54.36, N = 20)
AMD SME Enabled: 5050 (SE +/- 50.10, N = 20)

Renaissance

Test: Finagle HTTP Requests

Renaissance 0.14 (ms, Fewer Is Better)
No SME: 12286.3 (SE +/- 88.33, N = 3; MIN: 11326.41 / MAX: 12632.65)
AMD SME Enabled: 12347.5 (SE +/- 95.54, N = 3; MIN: 11146.33 / MAX: 12514.13)

Renaissance

Test: In-Memory Database Shootout

Renaissance 0.14 (ms, Fewer Is Better)
No SME: 4764.6 (SE +/- 54.74, N = 12; MIN: 4124.15 / MAX: 6577.01)
AMD SME Enabled: 4838.5 (SE +/- 69.41, N = 3; MIN: 4339.45 / MAX: 6109.38)

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
No SME: 52.9 (SE +/- 1.03, N = 15)
AMD SME Enabled: 49.9 (SE +/- 0.70, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma
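
Because each result comes with a standard error (SE), a quick way to judge whether a gap is larger than run-to-run noise is to express the difference in units of the combined SE. The sketch below is only a rough screen under the assumption that the two runs are independent; it is not a formal significance test, and the function name `difference_in_se_units` is hypothetical. The inputs are the Zstd level 19 long-mode compression speeds reported above (52.9 MB/s with SE 1.03 without SME, 49.9 MB/s with SE 0.70 with SME).

```python
import math


def difference_in_se_units(mean_a: float, se_a: float,
                           mean_b: float, se_b: float) -> float:
    """Return (mean_a - mean_b) divided by the combined standard error.

    The combined SE is sqrt(se_a**2 + se_b**2), which assumes the two
    sets of runs are independent of each other.
    """
    combined_se = math.hypot(se_a, se_b)
    return (mean_a - mean_b) / combined_se


# Zstd 1.5.0, level 19 long mode, compression speed in MB/s.
ratio = difference_in_se_units(52.9, 1.03, 49.9, 0.70)
print(f"Difference is {ratio:.1f} combined standard errors")
```

A ratio well below 1 suggests the two configurations are indistinguishable at this sample size; the run counts here differ (N = 15 vs. N = 3), so the figure should be read with that in mind.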

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
No SME: 3825.0 (SE +/- 14.95, N = 15)
AMD SME Enabled: 3837.3 (SE +/- 1.04, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

srsRAN

Test: OFDM_Test

srsRAN 22.04.1 (Samples / Second, More Is Better)
No SME: 161733333 (SE +/- 600925.21, N = 3)
AMD SME Enabled: 162633333 (SE +/- 883804.91, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM

srsRAN 22.04.1 (eNb Mb/s, More Is Better)
No SME: 415.1 (SE +/- 0.49, N = 3)
AMD SME Enabled: 408.5 (SE +/- 3.12, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
No SME: 157.8 (SE +/- 0.25, N = 3)
AMD SME Enabled: 157.7 (SE +/- 0.31, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM

srsRAN 22.04.1 (eNb Mb/s, More Is Better)
No SME: 413.9 (SE +/- 0.64, N = 3)
AMD SME Enabled: 415.7 (SE +/- 0.45, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
No SME: 165.8 (SE +/- 0.84, N = 3)
AMD SME Enabled: 165.9 (SE +/- 0.46, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 256-QAM

srsRAN 22.04.1 (eNb Mb/s, More Is Better)
No SME: 445.2 (SE +/- 1.49, N = 3)
AMD SME Enabled: 444.8 (SE +/- 1.01, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 256-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
No SME: 166.0 (SE +/- 0.24, N = 3)
AMD SME Enabled: 165.7 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB SISO 256-QAM

srsRAN 22.04.1 (eNb Mb/s, More Is Better)
No SME: 444.0 (SE +/- 1.15, N = 3)
AMD SME Enabled: 445.7 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB SISO 256-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
No SME: 172.7 (SE +/- 0.47, N = 3)
AMD SME Enabled: 172.2 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 5G PHY_DL_NR Test 52 PRB SISO 64-QAM

srsRAN 22.04.1 (eNb Mb/s, More Is Better)
No SME: 139.7 (SE +/- 0.32, N = 3)
AMD SME Enabled: 139.1 (SE +/- 0.19, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 5G PHY_DL_NR Test 52 PRB SISO 64-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
No SME: 94.4 (SE +/- 0.22, N = 3)
AMD SME Enabled: 94.9 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -std=c++14 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
No SME: 34.47 (SE +/- 0.53, N = 15)
AMD SME Enabled: 33.12 (SE +/- 0.56, N = 12)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 3.13 (Frames Per Second, More Is Better)
No SME: 183.25 (SE +/- 0.65, N = 3; MIN: 135.3 / MAX: 213.14)
AMD SME Enabled: 180.57 (SE +/- 0.40, N = 3; MIN: 129.9 / MAX: 210)

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
No SME: 74.31 (SE +/- 0.95, N = 3)
AMD SME Enabled: 73.32 (SE +/- 0.91, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
No SME: 77.68 (SE +/- 0.77, N = 3)
AMD SME Enabled: 76.22 (SE +/- 0.97, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

SVT-AV1 1.4 (Frames Per Second, More Is Better)
No SME: 248.33 (SE +/- 6.22, N = 15)
AMD SME Enabled: 251.44 (SE +/- 4.08, N = 15)

x264

Video Input: Bosphorus 4K

x264 2022-02-22 (Frames Per Second, More Is Better)
No SME: 106.86 (SE +/- 1.42, N = 3)
AMD SME Enabled: 103.07 (SE +/- 0.62, N = 3)
1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -m64 -lm -lpthread -O3 -flto

x265

Video Input: Bosphorus 4K

x265 3.4 (Frames Per Second, More Is Better)
No SME: 23.48 (SE +/- 0.29, N = 4)
AMD SME Enabled: 23.29 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
No SME: 70.37 (SE +/- 0.11, N = 3)
AMD SME Enabled: 70.28 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096

Intel Open Image Denoise 1.4.0 (Images / Sec, More Is Better)
No SME: 1.66 (SE +/- 0.00, N = 3)
AMD SME Enabled: 1.66 (SE +/- 0.00, N = 3)

OpenVKL

Benchmark: vklBenchmark ISPC

OpenVKL 1.3.1 (Items / Sec, More Is Better)
No SME: 1322 (SE +/- 6.81, N = 3; MIN: 329 / MAX: 4485)
AMD SME Enabled: 1286 (SE +/- 15.55, N = 4; MIN: 328 / MAX: 5485)

OSPRay

Benchmark: particle_volume/pathtracer/real_time

OSPRay 2.10 (Items Per Second, More Is Better)
No SME: 230.62 (SE +/- 1.25, N = 3)
AMD SME Enabled: 229.88 (SE +/- 1.51, N = 3)

OSPRay

Benchmark: gravity_spheres_volume/dim_512/ao/real_time

OSPRay 2.10 (Items Per Second, More Is Better)
No SME: 43.85 (SE +/- 0.04, N = 3)
AMD SME Enabled: 43.38 (SE +/- 0.14, N = 3)

7-Zip Compression

Test: Compression Rating

7-Zip Compression 22.01 (MIPS, More Is Better)
No SME: 917782 (SE +/- 9930.11, N = 3)
AMD SME Enabled: 885135 (SE +/- 6113.05, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 22.01 (MIPS, More Is Better)
No SME: 1160632 (SE +/- 10921.46, N = 3)
AMD SME Enabled: 1169038 (SE +/- 7858.68, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.11 (Seconds, Fewer Is Better)
No SME: 34.69 (SE +/- 0.03, N = 3)
AMD SME Enabled: 35.26 (SE +/- 0.42, N = 4)
1. (CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.11 (Seconds, Fewer Is Better)
No SME: 2.393 (SE +/- 0.006, N = 3)
AMD SME Enabled: 2.469 (SE +/- 0.019, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

Timed Gem5 Compilation

Time To Compile

Timed Gem5 Compilation 21.2 (Seconds, Fewer Is Better)
No SME: 138.64 (SE +/- 1.59, N = 3)
AMD SME Enabled: 142.18 (SE +/- 1.00, N = 3)

Timed Godot Game Engine Compilation

Time To Compile

Timed Godot Game Engine Compilation 3.2.3 (Seconds, Fewer Is Better)
No SME: 34.14 (SE +/- 0.48, N = 3)
AMD SME Enabled: 35.04 (SE +/- 0.36, N = 3)

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.1 (Seconds, Fewer Is Better)
No SME: 25.71 (SE +/- 0.22, N = 8)
AMD SME Enabled: 25.30 (SE +/- 0.23, N = 7)

Timed Linux Kernel Compilation

Build: allmodconfig

Timed Linux Kernel Compilation 6.1 (Seconds, Fewer Is Better)
No SME: 146.33 (SE +/- 1.13, N = 3)
AMD SME Enabled: 148.44 (SE +/- 0.71, N = 3)

Timed LLVM Compilation

Build System: Ninja

Timed LLVM Compilation 13.0 (Seconds, Fewer Is Better)
No SME: 75.33 (SE +/- 0.38, N = 3)
AMD SME Enabled: 76.63 (SE +/- 0.35, N = 3)

Timed LLVM Compilation

Build System: Unix Makefiles

Timed LLVM Compilation 13.0 (Seconds, Fewer Is Better)
No SME: 160.13 (SE +/- 0.17, N = 3)
AMD SME Enabled: 162.63 (SE +/- 0.05, N = 3)

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer

OSPRay Studio 0.11 (ms, Fewer Is Better)
No SME: 22043 (SE +/- 6.36, N = 3)
AMD SME Enabled: 22614 (SE +/- 32.58, N = 3)
1. (CXX) g++ options: -O3 -ldl

Liquid-DSP

Threads: 256 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
No SME: 10332000000 (SE +/- 8082903.77, N = 3)
AMD SME Enabled: 10344666667 (SE +/- 5206833.12, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 384 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
No SME: 10346000000 (SE +/- 3785938.90, N = 3)
AMD SME Enabled: 10350000000 (SE +/- 3605551.28, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0 (Mpix/sec, More Is Better)
No SME: 83598.3 (SE +/- 368.27, N = 3)
AMD SME Enabled: 78718.7 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0 (Mpix/sec, More Is Better)
No SME: 93071.0 (SE +/- 460.77, N = 3)
AMD SME Enabled: 89541.8 (SE +/- 422.37, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASTC Encoder

Preset: Thorough

ASTC Encoder 4.0 (MT/s, More Is Better)
No SME: 106.42 (SE +/- 0.06, N = 3)
AMD SME Enabled: 106.56 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 4.0 (MT/s, More Is Better)
No SME: 11.82 (SE +/- 0.01, N = 3)
AMD SME Enabled: 11.84 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

Graph500

Scale: 26

Graph500 3.0 (bfs median_TEPS, More Is Better)
No SME: 1426480000
AMD SME Enabled: 1358510000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs max_TEPS, More Is Better)
No SME: 1533180000
AMD SME Enabled: 1526380000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp median_TEPS, More Is Better)
No SME: 593153000
AMD SME Enabled: 572510000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 - sssp max_TEPS, More Is Better
No SME: 838505000
AMD SME Enabled: 835467000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2022.1 - Ns Per Day, More Is Better
No SME: 18.71 (SE +/- 0.03, N = 3)
AMD SME Enabled: 18.62 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3

PostgreSQL

Scaling Factor: 100 - Clients: 250 - Mode: Read Only

PostgreSQL 15 - TPS, More Is Better
No SME: 2951147 (SE +/- 16891.69, N = 3)
AMD SME Enabled: 2970869 (SE +/- 40566.19, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.10 - images/sec, More Is Better
No SME: 508.40 (SE +/- 6.01, N = 15)
AMD SME Enabled: 505.26 (SE +/- 7.26, N = 15)

KTX-Software toktx

Settings: Zstd Compression 9

KTX-Software toktx 4.0 - Seconds, Fewer Is Better
No SME: 2.734 (SE +/- 0.006, N = 3)
AMD SME Enabled: 2.776 (SE +/- 0.006, N = 3)

KTX-Software toktx

Settings: Zstd Compression 19

KTX-Software toktx 4.0 - Seconds, Fewer Is Better
No SME: 18.86 (SE +/- 0.08, N = 3)
AMD SME Enabled: 19.88 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 84.25 (SE +/- 0.10, N = 3)
AMD SME Enabled: 83.64 (SE +/- 0.11, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 1134.42 (SE +/- 0.43, N = 3)
AMD SME Enabled: 1143.04 (SE +/- 0.31, N = 3)

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 762.70 (SE +/- 0.54, N = 3)
AMD SME Enabled: 745.64 (SE +/- 0.69, N = 3)

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 125.53 (SE +/- 0.08, N = 3)
AMD SME Enabled: 128.40 (SE +/- 0.14, N = 3)

Neural Magic DeepSparse

Model: CV Detection,YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 858.47 (SE +/- 1.61, N = 3)
AMD SME Enabled: 840.94 (SE +/- 0.37, N = 3)

Neural Magic DeepSparse

Model: CV Detection,YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 111.46 (SE +/- 0.16, N = 3)
AMD SME Enabled: 113.83 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 1962.10 (SE +/- 1.94, N = 3)
AMD SME Enabled: 1911.40 (SE +/- 4.55, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 48.83 (SE +/- 0.05, N = 3)
AMD SME Enabled: 50.11 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 1204.85 (SE +/- 2.65, N = 3)
AMD SME Enabled: 1179.00 (SE +/- 1.22, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 79.49 (SE +/- 0.15, N = 3)
AMD SME Enabled: 81.23 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 617.03 (SE +/- 0.96, N = 3)
AMD SME Enabled: 601.93 (SE +/- 0.11, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 155.10 (SE +/- 0.23, N = 3)
AMD SME Enabled: 158.94 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - items/sec, More Is Better
No SME: 84.28 (SE +/- 0.09, N = 3)
AMD SME Enabled: 83.82 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.1 - ms/batch, Fewer Is Better
No SME: 1133.31 (SE +/- 1.72, N = 3)
AMD SME Enabled: 1143.01 (SE +/- 0.38, N = 3)

WRF

Input: conus 2.5km

WRF 4.2.2 - Seconds, Fewer Is Better
No SME: 4077.19
AMD SME Enabled: 4116.62
1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

GPAW

Input: Carbon Nanotube

GPAW 22.1 - Seconds, Fewer Is Better
No SME: 23.17 (SE +/- 0.16, N = 3)
AMD SME Enabled: 22.98 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 3.4 - Seconds, Fewer Is Better
No SME: 20.99 (SE +/- 0.08, N = 3)
AMD SME Enabled: 20.95 (SE +/- 0.02, N = 3)

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 3.4 - Seconds, Fewer Is Better
No SME: 80.77 (SE +/- 0.30, N = 3)
AMD SME Enabled: 81.57 (SE +/- 0.45, N = 3)

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 101.90 (SE +/- 0.24, N = 3)
AMD SME Enabled: 101.55 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 469.81 (SE +/- 0.87, N = 3; MIN: 402.31 / MAX: 539.43)
AMD SME Enabled: 471.53 (SE +/- 0.71, N = 3; MIN: 415.73 / MAX: 547.47)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 43.29 (SE +/- 0.12, N = 3)
AMD SME Enabled: 42.33 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 1102.15 (SE +/- 3.23, N = 3; MIN: 799.38 / MAX: 1782.76)
AMD SME Enabled: 1127.51 (SE +/- 5.54, N = 3; MIN: 802.53 / MAX: 1835.35)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 42.76 (SE +/- 0.23, N = 3)
AMD SME Enabled: 42.03 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 1115.56 (SE +/- 5.83, N = 3; MIN: 773.47 / MAX: 1806.09)
AMD SME Enabled: 1134.68 (SE +/- 0.32, N = 3; MIN: 842.92 / MAX: 1806.88)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 7437.73 (SE +/- 3.74, N = 3)
AMD SME Enabled: 7274.98 (SE +/- 4.95, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 6.44 (SE +/- 0.00, N = 3; MIN: 5.05 / MAX: 61.63)
AMD SME Enabled: 6.59 (SE +/- 0.01, N = 3; MIN: 5 / MAX: 63.2)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 193.95 (SE +/- 0.05, N = 3)
AMD SME Enabled: 193.83 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 246.95 (SE +/- 0.06, N = 3; MIN: 205.26 / MAX: 303.51)
AMD SME Enabled: 247.23 (SE +/- 0.05, N = 3; MIN: 208.68 / MAX: 293.42)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 11180.63 (SE +/- 2.65, N = 3)
AMD SME Enabled: 11184.76 (SE +/- 4.87, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 4.28 (SE +/- 0.00, N = 3; MIN: 3.51 / MAX: 38.77)
AMD SME Enabled: 4.28 (SE +/- 0.00, N = 3; MIN: 3.5 / MAX: 42.32)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 9997.68 (SE +/- 3.98, N = 3)
AMD SME Enabled: 9990.19 (SE +/- 13.80, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 4.79 (SE +/- 0.00, N = 3; MIN: 3.96 / MAX: 30.92)
AMD SME Enabled: 4.79 (SE +/- 0.01, N = 3; MIN: 3.95 / MAX: 30.1)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 967.90 (SE +/- 1.57, N = 3)
AMD SME Enabled: 963.14 (SE +/- 1.50, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 49.54 (SE +/- 0.08, N = 3; MIN: 37.27 / MAX: 225.52)
AMD SME Enabled: 49.78 (SE +/- 0.08, N = 3; MIN: 38.76 / MAX: 189.77)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 19801.40 (SE +/- 18.52, N = 3)
AMD SME Enabled: 19704.84 (SE +/- 4.31, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 9.62 (SE +/- 0.01, N = 3; MIN: 8.26 / MAX: 57.97)
AMD SME Enabled: 9.67 (SE +/- 0.00, N = 3; MIN: 8.32 / MAX: 78.9)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 9027.72 (SE +/- 7.89, N = 3)
AMD SME Enabled: 8993.90 (SE +/- 6.93, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 5.31 (SE +/- 0.01, N = 3; MIN: 4.34 / MAX: 44.16)
AMD SME Enabled: 5.33 (SE +/- 0.00, N = 3; MIN: 4.45 / MAX: 44.32)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 150792.42 (SE +/- 878.85, N = 3)
AMD SME Enabled: 148736.04 (SE +/- 1824.08, N = 4)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 0.55 (SE +/- 0.00, N = 3; MIN: 0.5 / MAX: 30.13)
AMD SME Enabled: 0.55 (SE +/- 0.00, N = 4; MIN: 0.5 / MAX: 36.47)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - FPS, More Is Better
No SME: 165194.42 (SE +/- 2225.36, N = 3)
AMD SME Enabled: 167545.54 (SE +/- 702.31, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2022.2.dev - ms, Fewer Is Better
No SME: 0.36 (SE +/- 0.00, N = 3; MIN: 0.34 / MAX: 40.99)
AMD SME Enabled: 0.36 (SE +/- 0.00, N = 3; MIN: 0.34 / MAX: 47.65)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -flto -shared

Xsbench

Xsbench 2017-07-06 - Lookups/s, More Is Better
No SME: 29806415 (SE +/- 43563.46, N = 3)
AMD SME Enabled: 29021428 (SE +/- 367701.15, N = 15)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -lm

nginx

Connections: 500

nginx 1.23.2 - Requests Per Second, More Is Better
No SME: 201056.69 (SE +/- 238.08, N = 3)
AMD SME Enabled: 196386.41 (SE +/- 124.29, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 - Inferences Per Minute, More Is Better
No SME: 5600 (SE +/- 15.47, N = 3)
AMD SME Enabled: 5583 (SE +/- 40.49, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt

Appleseed

Scene: Emily

Appleseed 2.0 Beta - Seconds, Fewer Is Better
No SME: 142.95
AMD SME Enabled: 150.92

PyHPC Benchmarks

Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of State

PyHPC Benchmarks 3.0 - Seconds, Fewer Is Better
No SME: 0.884 (SE +/- 0.002, N = 3)
AMD SME Enabled: 0.934 (SE +/- 0.003, N = 3)

PyHPC Benchmarks

Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral Mixing

PyHPC Benchmarks 3.0 - Seconds, Fewer Is Better
No SME: 1.691 (SE +/- 0.006, N = 3)
AMD SME Enabled: 1.744 (SE +/- 0.005, N = 3)

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, Fewer Is Better
No SME: 0.863726 (SE +/- 0.005137, N = 3; MIN: 0.75)
AMD SME Enabled: 0.850418 (SE +/- 0.004505, N = 3; MIN: 0.74)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 - ms, Fewer Is Better
No SME: 3.89299 (SE +/- 0.06074, N = 15; MIN: 2.77)
AMD SME Enabled: 3.92313 (SE +/- 0.05423, N = 3; MIN: 2.89)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, Fewer Is Better
No SME: 0.522052 (SE +/- 0.001481, N = 3; MIN: 0.42)
AMD SME Enabled: 0.526628 (SE +/- 0.001428, N = 3; MIN: 0.42)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, Fewer Is Better
No SME: 22.68 (SE +/- 0.08, N = 3; MIN: 19.97)
AMD SME Enabled: 23.14 (SE +/- 0.20, N = 3; MIN: 20.17)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, Fewer Is Better
No SME: 0.916133 (SE +/- 0.009364, N = 5; MIN: 0.76)
AMD SME Enabled: 0.918482 (SE +/- 0.004429, N = 3; MIN: 0.78)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, Fewer Is Better
No SME: 2011.15 (SE +/- 18.08, N = 7; MIN: 1936.22)
AMD SME Enabled: 2002.43 (SE +/- 18.68, N = 6; MIN: 1924.96)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
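
Since each result above is a No SME / AMD SME Enabled pair, the relative cost of enabling SME can be worked out per test. The following is a minimal Python sketch of that arithmetic, not part of the Phoronix Test Suite output. The function name `sme_overhead_percent` is hypothetical, and the sketch assumes the first value in each chart belongs to No SME and the second to AMD SME Enabled, following the legend order.

```python
def sme_overhead_percent(no_sme, sme_enabled, more_is_better):
    """Percentage by which the SME-enabled run is worse than the No SME run.

    A positive result means SME was slower; a negative result means it was faster.
    Assumes the first value in each chart is No SME (legend order).
    """
    if more_is_better:
        # Throughput-style metric: a lower value with SME is a loss.
        return (no_sme - sme_enabled) / no_sme * 100.0
    # Time-style metric: a higher value with SME is a loss.
    return (sme_enabled - no_sme) / no_sme * 100.0


# (test name, No SME value, AMD SME Enabled value, more_is_better)
# Values copied from the results above.
results = [
    ("KTX-Software toktx, Zstd Compression 19 (Seconds)", 18.86, 19.88, False),
    ("nginx, Connections: 500 (Requests Per Second)", 201056.69, 196386.41, True),
    ("GPAW, Carbon Nanotube (Seconds)", 23.17, 22.98, False),
]

for name, base, sme, more_is_better in results:
    print(f"{name}: {sme_overhead_percent(base, sme, more_is_better):+.2f}%")
```

Differences smaller than the reported standard errors are probably best read as noise rather than as a measured SME cost or gain.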


Phoronix Test Suite v10.8.4