9684x ne

2 x AMD EPYC 9684X 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2310151-NE-9684XNE6190
Jump To Table - Results

View

Do Not Show Noisy Results
Do Not Show Results With Incomplete Data
Do Not Show Results With Little Change/Spread
List Notable Results

Limit displaying results to tests within:

Creator Workloads 4 Tests
Game Development 2 Tests
Multi-Core 4 Tests
Intel oneAPI 4 Tests

Statistics

Show Overall Harmonic Mean(s)
Show Overall Geometric Mean
Show Geometric Means Per-Suite/Category
Show Wins / Losses Counts (Pie Chart)
Normalize Results
Remove Outliers Before Calculating Averages

Graph Settings

Force Line Graphs Where Applicable
Convert To Scalar Where Applicable
Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Highlight
Result
Hide
Result
Result
Identifier
View Logs
Performance Per
Dollar
Date
Run
  Test
  Duration
a
October 15 2023
  2 Hours, 49 Minutes
b
October 15 2023
  41 Minutes
Invert Hiding All Results Option
  1 Hour, 45 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


9684x neOpenBenchmarking.orgPhoronix Test Suite2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads)AMD Titanite_4G (RTI1007B BIOS)AMD Device 14a41520GB3201GB Micron_7450_MTFDKCC3T2TFSASPEEDBroadcom NetXtreme BCM5720 PCIeUbuntu 23.106.6.0-060600rc1-generic (x86_64)GNOME ShellX Server 1.21.1.7GCC 13.2.0ext41920x1200ProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelDesktopDisplay ServerCompilerFile-SystemScreen Resolution9684x Ne BenchmarksSystem Logs- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-nEN1TP/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-nEN1TP/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

a vs. b ComparisonPhoronix Test SuiteBaseline+12%+12%+24%+24%+36%+36%19.5%16.1%5%4.1%3.3%IP Shapes 1D - u8s8f32 - CPU47.9%IP Shapes 1D - f32 - CPUe.G.B.S - 240C.B.S.A - f32 - CPU5.9%IP Shapes 1D - bf16bf16bf16 - CPU5.5%IP Shapes 3D - f32 - CPUIP Shapes 3D - u8s8f32 - CPUC.B.S.A - bf16bf16bf16 - CPU3.8%R.N.N.I - f32 - CPUIP Shapes 3D - bf16bf16bf16 - CPU3.2%R.N.N.I - bf16bf16bf16 - CPU2.9%D.B.s - f32 - CPU2.1%C.B.S.A - u8s8f32 - CPU2%oneDNNoneDNNeasyWaveoneDNNoneDNNoneDNNoneDNNoneDNNoneDNNoneDNNoneDNNoneDNNoneDNNab

9684x neeasywave: e2Asean Grid + BengkuluSept2007 Source - 240easywave: e2Asean Grid + BengkuluSept2007 Source - 1200easywave: e2Asean Grid + BengkuluSept2007 Source - 2400embree: Pathtracer - Crownembree: Pathtracer ISPC - Crownembree: Pathtracer - Asian Dragonembree: Pathtracer - Asian Dragon Objembree: Pathtracer ISPC - Asian Dragonembree: Pathtracer ISPC - Asian Dragon Objoidn: RT.hdr_alb_nrm.3840x2160 - CPU-Onlyoidn: RT.ldr_alb_nrm.3840x2160 - CPU-Onlyoidn: RTLightmap.hdr.4096x4096 - CPU-Onlyopenvkl: vklBenchmarkCPU ISPCopenvkl: vklBenchmarkCPU Scalaronednn: IP Shapes 1D - f32 - CPUonednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUonednn: IP Shapes 3D - u8s8f32 - CPUonednn: IP Shapes 1D - bf16bf16bf16 - CPUonednn: IP Shapes 3D - bf16bf16bf16 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPUonednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPUonednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUab3.57440.48397.678192.4860200.4923212.4785188.6699234.8068201.53423.783.801.89353014948.241954.4065111.458070.85188732.14853.447280.48080330.36530.9487930.3198100.7773160.2756631862.612131.261817.580.3724871.568830.6258112070.841860.352036.883.07939.75297.448192.1728199.7125212.0222189.7142233.9207200.79993.803.781.90352914946.896514.1977716.94960.81826533.91123.557420.50918530.55880.9691450.3261610.77160.2784551852.732062.291798.180.3868261.542480.6170572104.981831.262096.03OpenBenchmarking.org

easyWave

The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BettereasyWave r34Input: e2Asean Grid + BengkuluSept2007 Source - Time: 240ab0.80421.60842.41263.21684.021SE +/- 0.189, N = 153.5743.0791. (CXX) g++ options: -O3 -fopenmp
OpenBenchmarking.orgSeconds, Fewer Is BettereasyWave r34Input: e2Asean Grid + BengkuluSept2007 Source - Time: 240ab246810Min: 2.99 / Avg: 3.57 / Max: 5.231. (CXX) g++ options: -O3 -fopenmp

OpenBenchmarking.orgSeconds, Fewer Is BettereasyWave r34Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200ab918273645SE +/- 2.18, N = 1240.4839.751. (CXX) g++ options: -O3 -fopenmp
OpenBenchmarking.orgSeconds, Fewer Is BettereasyWave r34Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200ab816243240Min: 33.87 / Avg: 40.48 / Max: 61.11. (CXX) g++ options: -O3 -fopenmp

OpenBenchmarking.orgSeconds, Fewer Is BettereasyWave r34Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400ab20406080100SE +/- 3.79, N = 1597.6897.451. (CXX) g++ options: -O3 -fopenmp
OpenBenchmarking.orgSeconds, Fewer Is BettereasyWave r34Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400ab20406080100Min: 80.89 / Avg: 97.68 / Max: 127.341. (CXX) g++ options: -O3 -fopenmp

Embree

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer - Model: Crownab4080120160200SE +/- 0.47, N = 3192.49192.17MIN: 181.49 / MAX: 212.32MIN: 183.96 / MAX: 208.29
OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer - Model: Crownab4080120160200Min: 192 / Avg: 192.49 / Max: 193.43

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer ISPC - Model: Crownab4080120160200SE +/- 0.51, N = 3200.49199.71MIN: 188.57 / MAX: 219.64MIN: 188.89 / MAX: 218.34
OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer ISPC - Model: Crownab4080120160200Min: 199.71 / Avg: 200.49 / Max: 201.46

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer - Model: Asian Dragonab50100150200250SE +/- 0.40, N = 3212.48212.02MIN: 203.46 / MAX: 225.48MIN: 205.37 / MAX: 221.09
OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer - Model: Asian Dragonab4080120160200Min: 211.69 / Avg: 212.48 / Max: 212.97

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer - Model: Asian Dragon Objab4080120160200SE +/- 1.11, N = 3188.67189.71MIN: 178.28 / MAX: 200.84MIN: 183.29 / MAX: 200.72
OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer - Model: Asian Dragon Objab306090120150Min: 186.46 / Avg: 188.67 / Max: 189.98

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer ISPC - Model: Asian Dragonab50100150200250SE +/- 0.31, N = 3234.81233.92MIN: 223.76 / MAX: 253.29MIN: 224.01 / MAX: 248.26
OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer ISPC - Model: Asian Dragonab4080120160200Min: 234.44 / Avg: 234.81 / Max: 235.42

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer ISPC - Model: Asian Dragon Objab4080120160200SE +/- 0.75, N = 3201.53200.80MIN: 189 / MAX: 214.5MIN: 190.91 / MAX: 211.99
OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.3Binary: Pathtracer ISPC - Model: Asian Dragon Objab4080120160200Min: 200.09 / Avg: 201.53 / Max: 202.64

Intel Open Image Denoise

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Onlyab0.8551.712.5653.424.275SE +/- 0.02, N = 33.783.80
OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Onlyab246810Min: 3.75 / Avg: 3.78 / Max: 3.8

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Onlyab0.8551.712.5653.424.275SE +/- 0.01, N = 33.803.78
OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Onlyab246810Min: 3.79 / Avg: 3.8 / Max: 3.82

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RTLightmap.hdr.4096x4096 - Device: CPU-Onlyab0.42750.8551.28251.712.1375SE +/- 0.00, N = 31.891.90
OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RTLightmap.hdr.4096x4096 - Device: CPU-Onlyab246810Min: 1.89 / Avg: 1.89 / Max: 1.9

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 2.0.0Benchmark: vklBenchmarkCPU ISPCab8001600240032004000SE +/- 1.20, N = 335303529MIN: 303 / MAX: 39315MIN: 305 / MAX: 39378
OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 2.0.0Benchmark: vklBenchmarkCPU ISPCab6001200180024003000Min: 3528 / Avg: 3529.67 / Max: 3532

OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 2.0.0Benchmark: vklBenchmarkCPU Scalarab30060090012001500SE +/- 1.15, N = 314941494MIN: 113 / MAX: 23868MIN: 114 / MAX: 23822
OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 2.0.0Benchmark: vklBenchmarkCPU Scalarab30060090012001500Min: 1492 / Avg: 1494 / Max: 1496

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: f32 - Engine: CPUab246810SE +/- 0.61719, N = 158.241956.89651MIN: 3.98MIN: 5.331. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: f32 - Engine: CPUab3691215Min: 4.58 / Avg: 8.24 / Max: 11.861. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: f32 - Engine: CPUab0.99151.9832.97453.9664.9575SE +/- 0.01148, N = 34.406514.19777MIN: 3.53MIN: 3.461. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: f32 - Engine: CPUab246810Min: 4.38 / Avg: 4.41 / Max: 4.421. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPUab48121620SE +/- 1.01, N = 1511.4616.95MIN: 4.44MIN: 11.661. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPUab48121620Min: 4.88 / Avg: 11.46 / Max: 17.571. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPUab0.19170.38340.57510.76680.9585SE +/- 0.007085, N = 30.8518870.818265MIN: 0.7MIN: 0.691. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPUab246810Min: 0.84 / Avg: 0.85 / Max: 0.861. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPUab816243240SE +/- 0.61, N = 1532.1533.91MIN: 20.73MIN: 25.571. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPUab714212835Min: 28.52 / Avg: 32.15 / Max: 34.441. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPUab0.80041.60082.40123.20164.002SE +/- 0.02918, N = 153.447283.55742MIN: 2.55MIN: 2.681. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPUab246810Min: 3.17 / Avg: 3.45 / Max: 3.61. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPUab0.11460.22920.34380.45840.573SE +/- 0.004181, N = 30.4808030.509185MIN: 0.41MIN: 0.451. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPUab246810Min: 0.47 / Avg: 0.48 / Max: 0.491. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPUab714212835SE +/- 0.11, N = 330.3730.56MIN: 28.2MIN: 28.841. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPUab714212835Min: 30.14 / Avg: 30.37 / Max: 30.531. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPUab0.21810.43620.65430.87241.0905SE +/- 0.003774, N = 30.9487930.969145MIN: 0.88MIN: 0.881. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPUab246810Min: 0.94 / Avg: 0.95 / Max: 0.961. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPUab0.07340.14680.22020.29360.367SE +/- 0.002729, N = 150.3198100.326161MIN: 0.24MIN: 0.251. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPUab12345Min: 0.31 / Avg: 0.32 / Max: 0.341. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPUab0.17490.34980.52470.69960.8745SE +/- 0.004465, N = 30.7773160.771600MIN: 0.57MIN: 0.571. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPUab246810Min: 0.77 / Avg: 0.78 / Max: 0.791. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPUab0.06270.12540.18810.25080.3135SE +/- 0.001684, N = 30.2756630.278455MIN: 0.26MIN: 0.271. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPUab12345Min: 0.27 / Avg: 0.28 / Max: 0.281. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPUab400800120016002000SE +/- 14.79, N = 91862.611852.73MIN: 1754.2MIN: 1829.71. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPUab30060090012001500Min: 1778.02 / Avg: 1862.61 / Max: 1928.231. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPUab5001000150020002500SE +/- 13.26, N = 32131.262062.29MIN: 2090.72MIN: 2023.871. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPUab400800120016002000Min: 2108.78 / Avg: 2131.26 / Max: 2154.71. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPUab400800120016002000SE +/- 24.34, N = 31817.581798.18MIN: 1756.95MIN: 1779.281. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPUab30060090012001500Min: 1773.09 / Avg: 1817.58 / Max: 1856.931. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPUab0.0870.1740.2610.3480.435SE +/- 0.004142, N = 40.3724870.386826MIN: 0.33MIN: 0.351. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPUab12345Min: 0.37 / Avg: 0.37 / Max: 0.381. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPUab0.3530.7061.0591.4121.765SE +/- 0.01413, N = 31.568831.54248MIN: 1.26MIN: 1.321. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPUab246810Min: 1.55 / Avg: 1.57 / Max: 1.591. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPUab0.14080.28160.42240.56320.704SE +/- 0.002542, N = 30.6258110.617057MIN: 0.55MIN: 0.551. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPUab246810Min: 0.62 / Avg: 0.63 / Max: 0.631. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPUab5001000150020002500SE +/- 25.85, N = 32070.842104.98MIN: 2009.72MIN: 2079.451. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPUab400800120016002000Min: 2025.86 / Avg: 2070.84 / Max: 2115.391. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPUab400800120016002000SE +/- 17.10, N = 31860.351831.26MIN: 1812.41MIN: 1812.151. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPUab30060090012001500Min: 1830.72 / Avg: 1860.35 / Max: 1889.961. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPUab400800120016002000SE +/- 13.33, N = 32036.882096.03MIN: 1993.78MIN: 2065.341. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPUab400800120016002000Min: 2011.75 / Avg: 2036.88 / Max: 2057.151. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

35 Results Shown

easyWave:
  e2Asean Grid + BengkuluSept2007 Source - 240
  e2Asean Grid + BengkuluSept2007 Source - 1200
  e2Asean Grid + BengkuluSept2007 Source - 2400
Embree:
  Pathtracer - Crown
  Pathtracer ISPC - Crown
  Pathtracer - Asian Dragon
  Pathtracer - Asian Dragon Obj
  Pathtracer ISPC - Asian Dragon
  Pathtracer ISPC - Asian Dragon Obj
Intel Open Image Denoise:
  RT.hdr_alb_nrm.3840x2160 - CPU-Only
  RT.ldr_alb_nrm.3840x2160 - CPU-Only
  RTLightmap.hdr.4096x4096 - CPU-Only
OpenVKL:
  vklBenchmarkCPU ISPC
  vklBenchmarkCPU Scalar
oneDNN:
  IP Shapes 1D - f32 - CPU
  IP Shapes 3D - f32 - CPU
  IP Shapes 1D - u8s8f32 - CPU
  IP Shapes 3D - u8s8f32 - CPU
  IP Shapes 1D - bf16bf16bf16 - CPU
  IP Shapes 3D - bf16bf16bf16 - CPU
  Convolution Batch Shapes Auto - f32 - CPU
  Deconvolution Batch shapes_1d - f32 - CPU
  Deconvolution Batch shapes_3d - f32 - CPU
  Convolution Batch Shapes Auto - u8s8f32 - CPU
  Deconvolution Batch shapes_1d - u8s8f32 - CPU
  Deconvolution Batch shapes_3d - u8s8f32 - CPU
  Recurrent Neural Network Training - f32 - CPU
  Recurrent Neural Network Inference - f32 - CPU
  Recurrent Neural Network Training - u8s8f32 - CPU
  Convolution Batch Shapes Auto - bf16bf16bf16 - CPU
  Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU
  Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - u8s8f32 - CPU
  Recurrent Neural Network Training - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU