epyc-75f3-new

2 x AMD EPYC 75F3 32-Core testing with an ASRockRack ROME2D16-2T (P3.30 BIOS) and ASPEED on Ubuntu 21.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2204097-NE-EPYC75F3N46.

epyc-75f3-new

Runs: A, AA, B, C

Processor: 2 x AMD EPYC 75F3 32-Core @ 2.95GHz (64 Cores / 128 Threads)
Motherboard: ASRockRack ROME2D16-2T (P3.30 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: 1000GB Western Digital WD_BLACK SN850 1TB
Graphics: ASPEED
Audio: AMD Starship/Matisse
Network: 2 x Intel 10G X550T
OS: Ubuntu 21.10
Kernel: 5.17.0-051700rc4daily20220219-generic (x86_64)
Desktop: GNOME Shell 40.5
Display Server: X Server
Vulkan: 1.1.182
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa001114

Python Details
- Python 3.9.7

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

epyc-75f3-new result overview

Test | A | AA | B | C
perf-bench: Epoll Wait | 2352 | 2299 | 2572 | 2062
perf-bench: Futex Hash | 2976792 | 2976712 | 2979134 | 2945494
perf-bench: Memcpy 1MB | 42.868389 | 42.829829 | 44.147916 | 42.926485
perf-bench: Memset 1MB | 62.455072 | 63.415414 | 63.789341 | 63.185354
perf-bench: Sched Pipe | 345990 | 351368 | 346133 | 339461
perf-bench: Futex Lock-Pi | 77 | 77 | 80 | 77
perf-bench: Syscall Basic | 17051621 | 17052202 | 16941879 | 16924594
avifenc: 0 | 69.460 | 69.411 | 70.327 | 70.646
avifenc: 2 | 38.784 | 38.728 | 37.566 | 39.573
avifenc: 6 | 4.278 | 4.093 | 3.986 | 4.215
avifenc: 6, Lossless | 7.215 | 7.389 | 7.514 | 7.185
avifenc: 10, Lossless | 5.010 | 5.229 | 5.197 | 4.971
onednn: IP Shapes 1D - f32 - CPU | 2.94158 | 3.7792 | 3.18091 | 3.85444
onednn: IP Shapes 3D - f32 - CPU | 2.63912 | 2.54811 | 2.56643 | 2.54404
onednn: IP Shapes 1D - u8s8f32 - CPU | 4.05569 | 4.25865 | 4.77292 | 2.55861
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.747084 | 0.815632 | 0.757826 | 0.74301
onednn: Convolution Batch Shapes Auto - f32 - CPU | 0.705380 | 0.708812 | 0.684799 | 0.689583
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 11.2361 | 11.1395 | 11.2174 | 11.1942
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 2.12767 | 2.10407 | 1.94632 | 1.99639
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 0.897288 | 1.24701 | 0.928893 | 0.78443
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 1.68391 | 1.07515 | 1.853 | 1.23753
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 0.878553 | 0.904662 | 0.803426 | 0.829217
onednn: Recurrent Neural Network Training - f32 - CPU | 3874.18 | 4530.24 | 4336.25 | 4224.87
onednn: Recurrent Neural Network Inference - f32 - CPU | 1693.03 | 1803.51 | 1810.37 | 1713.39
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 3819.45 | 4416.22 | 4623.03 | 4390.23
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1624.246 | 2016.94 | 1801.29 | 1696.6
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 11.51214 | 17.5534 | 12.4965 | 21.6718
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | - | 4276.7 | 4464.12 | 3862.48
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | - | 1711.61 | 1734.92 | 1721.12
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | - | 22.9061 | 27.3436 | 29.1013
onnx: GPT-2 - CPU - Parallel | - | 1308 | 1225 | 1242
onnx: GPT-2 - CPU - Standard | - | 13234 | 13106 | 9094
onnx: yolov4 - CPU - Parallel | - | 434 | 436 | 436
onnx: yolov4 - CPU - Standard | - | 355 | 291 | 316
onnx: bertsquad-12 - CPU - Parallel | - | 594 | 592 | 592
onnx: bertsquad-12 - CPU - Standard | - | 540 | 643 | 539
onnx: fcn-resnet101-11 - CPU - Parallel | - | 123 | 123 | 124
onnx: fcn-resnet101-11 - CPU - Standard | - | 170 | 158 | 152
onnx: ArcFace ResNet-100 - CPU - Parallel | - | 1225 | 1235 | 1229
onnx: ArcFace ResNet-100 - CPU - Standard | - | 1194 | 1244 | 1279
onnx: super-resolution-10 - CPU - Parallel | - | 4320 | 4242 | 4233
onnx: super-resolution-10 - CPU - Standard | - | 4626 | 7762 | 4488

perf-bench

Benchmark: Epoll Wait

ops/sec, More Is Better
A: 2352, AA: 2299, B: 2572, C: 2062
SE +/- 23.81, N = 12
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma
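
As a quick way to read the spread between runs, the Epoll Wait figures above can be expressed relative to one run. The sketch below is illustrative only: it assumes the four runs are labeled A, AA, B, and C in chart order, picks A as the baseline arbitrarily, and uses a hypothetical helper name, `percent_vs_baseline`. None of it is part of the Phoronix Test Suite.

```python
# Epoll Wait results in ops/sec (more is better), copied from the chart above.
# Assumption: the run labels are A, AA, B, C in chart order.
results = {"A": 2352, "AA": 2299, "B": 2572, "C": 2062}


def percent_vs_baseline(results, baseline="A"):
    """Return each run's percent difference relative to the baseline run.

    Hypothetical helper for illustration; positive means faster than the
    baseline for a "more is better" metric.
    """
    base = results[baseline]
    return {
        name: round((value - base) / base * 100, 2)
        for name, value in results.items()
    }


deltas = percent_vs_baseline(results)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}%")
```

For a "fewer is better" metric such as the avifenc encode times, the sign reads the other way: a negative difference means the run was faster than the baseline.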

perf-bench

Benchmark: Futex Hash

ops/sec, More Is Better
A: 2976792, AA: 2976712, B: 2979134, C: 2945494
SE +/- 6620.62, N = 3
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma

perf-bench

Benchmark: Memcpy 1MB

GB/sec, More Is Better
A: 42.87, AA: 42.83, B: 44.15, C: 42.93
SE +/- 0.17, N = 3
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma

perf-bench

Benchmark: Memset 1MB

GB/sec, More Is Better
A: 62.46, AA: 63.42, B: 63.79, C: 63.19
SE +/- 0.65, N = 5
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma

perf-bench

Benchmark: Sched Pipe

ops/sec, More Is Better
A: 345990, AA: 351368, B: 346133, C: 339461
SE +/- 3279.19, N = 3
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma

perf-bench

Benchmark: Futex Lock-Pi

ops/sec, More Is Better
A: 77, AA: 77, B: 80, C: 77
SE +/- 0.58, N = 3
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma

perf-bench

Benchmark: Syscall Basic

ops/sec, More Is Better
A: 17051621, AA: 17052202, B: 16941879, C: 16924594
SE +/- 7517.61, N = 3
(CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lslang -lz -lnuma

libavif avifenc

Encoder Speed: 0

Seconds, Fewer Is Better (libavif avifenc 0.10)
A: 69.46, AA: 69.41, B: 70.33, C: 70.65
SE +/- 0.35, N = 3
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 2

Seconds, Fewer Is Better (libavif avifenc 0.10)
A: 38.78, AA: 38.73, B: 37.57, C: 39.57
SE +/- 0.11, N = 3
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6

Seconds, Fewer Is Better (libavif avifenc 0.10)
A: 4.278, AA: 4.093, B: 3.986, C: 4.215
SE +/- 0.011, N = 3
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6, Lossless

Seconds, Fewer Is Better (libavif avifenc 0.10)
A: 7.215, AA: 7.389, B: 7.514, C: 7.185
SE +/- 0.066, N = 7
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10, Lossless

Seconds, Fewer Is Better (libavif avifenc 0.10)
A: 5.010, AA: 5.229, B: 5.197, C: 4.971
SE +/- 0.030, N = 3
(CXX) g++ options: -O3 -fPIC -lm

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 2.94158 (MIN: 1.53), AA: 3.77920 (MIN: 2.55), B: 3.18091 (MIN: 2.24), C: 3.85444 (MIN: 2.65)
SE +/- 0.19680, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 2.63912 (MIN: 2.08), AA: 2.54811 (MIN: 2.06), B: 2.56643 (MIN: 2.05), C: 2.54404 (MIN: 2.05)
SE +/- 0.01578, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 4.05569 (MIN: 2.07), AA: 4.25865 (MIN: 2.63), B: 4.77292 (MIN: 2.83), C: 2.55861 (MIN: 1.85)
SE +/- 0.14420, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 0.747084 (MIN: 0.57), AA: 0.815632 (MIN: 0.62), B: 0.757826 (MIN: 0.62), C: 0.743010 (MIN: 0.61)
SE +/- 0.011604, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 0.705380 (MIN: 0.64), AA: 0.708812 (MIN: 0.67), B: 0.684799 (MIN: 0.64), C: 0.689583 (MIN: 0.64)
SE +/- 0.006925, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 11.24 (MIN: 8.6), AA: 11.14 (MIN: 9), B: 11.22 (MIN: 9), C: 11.19 (MIN: 8.84)
SE +/- 0.09, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 2.12767 (MIN: 1.72), AA: 2.10407 (MIN: 1.79), B: 1.94632 (MIN: 1.74), C: 1.99639 (MIN: 1.73)
SE +/- 0.02508, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 0.897288 (MIN: 0.51), AA: 1.247010 (MIN: 0.56), B: 0.928893 (MIN: 0.54), C: 0.784430 (MIN: 0.54)
SE +/- 0.048482, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 1.68391 (MIN: 0.87), AA: 1.07515 (MIN: 0.92), B: 1.85300 (MIN: 1.67), C: 1.23753 (MIN: 1.09)
SE +/- 0.09292, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 0.878553 (MIN: 0.65), AA: 0.904662 (MIN: 0.68), B: 0.803426 (MIN: 0.57), C: 0.829217 (MIN: 0.6)
SE +/- 0.011448, N = 12
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 3874.18 (MIN: 1544.77), AA: 4530.24 (MIN: 3953.07), B: 4336.25 (MIN: 3938.66), C: 4224.87 (MIN: 3842.95)
SE +/- 239.78, N = 14
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 1693.03 (MIN: 1036.24), AA: 1803.51 (MIN: 1650.67), B: 1810.37 (MIN: 1535.85), C: 1713.39 (MIN: 1504.13)
SE +/- 64.86, N = 12
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 3819.45 (MIN: 1342.54), AA: 4416.22 (MIN: 4047.45), B: 4623.03 (MIN: 4242.04), C: 4390.23 (MIN: 3742.53)
SE +/- 242.60, N = 12
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 1624.25 (MIN: 827.28), AA: 2016.94 (MIN: 1858.23), B: 1801.29 (MIN: 1534.31), C: 1696.60 (MIN: 1543.98)
SE +/- 88.05, N = 12
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
A: 11.51 (MIN: 5.15), AA: 17.55 (MIN: 12.56), B: 12.50 (MIN: 8.99), C: 21.67 (MIN: 15.66)
SE +/- 1.25, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
AA: 4276.70 (MIN: 3959.8), B: 4464.12 (MIN: 4162.92), C: 3862.48 (MIN: 3242.51)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
AA: 1711.61 (MIN: 1495.43), B: 1734.92 (MIN: 1463.54), C: 1721.12 (MIN: 1603.97)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.6)
AA: 22.91 (MIN: 16.94), B: 27.34 (MIN: 19.51), C: 29.10 (MIN: 20.22)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 1308, B: 1225, C: 1242
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 13234, B: 13106, C: 9094
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 434, B: 436, C: 436
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 355, B: 291, C: 316
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 594, B: 592, C: 592
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 540, B: 643, C: 539
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 123, B: 123, C: 124
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 170, B: 158, C: 152
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 1225, B: 1235, C: 1229
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 1194, B: 1244, C: 1279
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 4320, B: 4242, C: 4233
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

Inferences Per Minute, More Is Better (ONNX Runtime 1.11)
AA: 4626, B: 7762, C: 4488
(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt


Phoronix Test Suite v10.8.4