dlda Tests for a future article. Intel Core i9-10980XE testing with a ASRock X299 Steel Legend (P1.50 BIOS) and llvmpipe on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2404061-PTS-DLDA801834&grw&sor .
dlda Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Desktop Display Server OpenGL Vulkan Compiler File-System Screen Resolution a b Intel Core i9-10980XE @ 4.80GHz (18 Cores / 36 Threads) ASRock X299 Steel Legend (P1.50 BIOS) Intel Sky Lake-E DMI3 Registers 4 x 8GB DDR4-3600MT/s Samsung SSD 970 PRO 512GB llvmpipe Realtek ALC1220 Intel I219-V + Intel I211 Ubuntu 22.04 6.5.0-18-generic (x86_64) GNOME Shell 42.2 X Server 1.21.1.4 4.5 Mesa 22.0.1 (LLVM 13.0.1 256 bits) 1.2.204 GCC 11.4.0 ext4 1024x768 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_cpufreq schedutil - CPU Microcode: 0x5003604 Python Details - Python 3.10.12 Security Details - gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
dlda specfem3d: Water-layered Halfspace specfem3d: Homogeneous Halfspace specfem3d: Tomographic Model brl-cad: VGR Performance Metric tensorflow: CPU - 1 - AlexNet tensorflow: CPU - 16 - AlexNet tensorflow: CPU - 32 - AlexNet tensorflow: CPU - 64 - AlexNet tensorflow: CPU - 1 - GoogLeNet tensorflow: CPU - 1 - ResNet-50 tensorflow: CPU - 16 - GoogLeNet tensorflow: CPU - 16 - ResNet-50 tensorflow: CPU - 32 - GoogLeNet tensorflow: CPU - 32 - ResNet-50 tensorflow: CPU - 64 - GoogLeNet tensorflow: CPU - 64 - ResNet-50 llamafile: llava-v1.5-7b-q4 - CPU llamafile: mistral-7b-instruct-v0.2.Q8_0 - CPU specfem3d: Layered Halfspace llamafile: wizardcoder-python-34b-v1.0.Q6_K - CPU build-ffmpeg: Time To Compile x265: Bosphorus 4K x265: Bosphorus 1080p blender: BMW27 - CPU-Only blender: Junkshop - CPU-Only blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only blender: Pabellon Barcelona - CPU-Only ffmpeg: libx264 - Live ffmpeg: libx265 - Live ffmpeg: libx264 - Upload ffmpeg: libx265 - Upload ffmpeg: libx264 - Platform ffmpeg: libx265 - Platform ffmpeg: libx265 - Video On Demand build-mesa: Time To Compile rocksdb: Overwrite rocksdb: Rand Fill rocksdb: Rand Read rocksdb: Update Rand rocksdb: Seq Fill rocksdb: Rand Fill Sync rocksdb: Read While Writing rocksdb: Read Rand Write Rand specfem3d: Mount St. Helens ffmpeg: libx264 - Video On Demand a b 115.601201526 51.132054445 40.100782187 211860 14.37 153.21 235.74 287.38 51.39 14.49 128.18 40.06 128.1 41.22 125.69 41.59 13.31 9.49 112.489970072 2.93 47.369 20.86 29.37 91.49 117.6 264.21 120.6 931.12 313.36 178.34 41.09 11.15 9.17 41.55 17.77 17.80 26.37 1117358 1122751 74178255 633669 1307492 4670 3741170 2423455 42.510866238 41.62 114.462975473 50.527685606 40.082694716 212029 14.39 153.76 235.64 287.56 50.57 14.55 128.02 40.1 128.31 41.2 125.8 41.56 13.4 9.51 111.672788927 2.93 46.095 21.39 29.29 91.37 115.25 264.21 120.75 931.98 320.82 177.16 41.20 11.18 9.08 41.63 17.72 17.78 25.529 1121724 1128064 73039420 631933 1326770 4668 3698117 2425448 42.465093136 41.63 OpenBenchmarking.org
SPECFEM3D Model: Water-layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Water-layered Halfspace b a 30 60 90 120 150 114.46 115.60 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SPECFEM3D Model: Homogeneous Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Homogeneous Halfspace b a 12 24 36 48 60 50.53 51.13 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SPECFEM3D Model: Tomographic Model OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Tomographic Model b a 9 18 27 36 45 40.08 40.10 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
BRL-CAD VGR Performance Metric OpenBenchmarking.org VGR Performance Metric, More Is Better BRL-CAD 7.38.2 VGR Performance Metric b a 50K 100K 150K 200K 250K 212029 211860 1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
TensorFlow Device: CPU - Batch Size: 1 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 1 - Model: AlexNet b a 4 8 12 16 20 14.39 14.37
TensorFlow Device: CPU - Batch Size: 16 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 16 - Model: AlexNet b a 30 60 90 120 150 153.76 153.21
TensorFlow Device: CPU - Batch Size: 32 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 32 - Model: AlexNet a b 50 100 150 200 250 235.74 235.64
TensorFlow Device: CPU - Batch Size: 64 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: AlexNet b a 60 120 180 240 300 287.56 287.38
TensorFlow Device: CPU - Batch Size: 1 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 1 - Model: GoogLeNet a b 12 24 36 48 60 51.39 50.57
TensorFlow Device: CPU - Batch Size: 1 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 1 - Model: ResNet-50 b a 4 8 12 16 20 14.55 14.49
TensorFlow Device: CPU - Batch Size: 16 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 16 - Model: GoogLeNet a b 30 60 90 120 150 128.18 128.02
TensorFlow Device: CPU - Batch Size: 16 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 16 - Model: ResNet-50 b a 9 18 27 36 45 40.10 40.06
TensorFlow Device: CPU - Batch Size: 32 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 32 - Model: GoogLeNet b a 30 60 90 120 150 128.31 128.10
TensorFlow Device: CPU - Batch Size: 32 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 32 - Model: ResNet-50 a b 9 18 27 36 45 41.22 41.20
TensorFlow Device: CPU - Batch Size: 64 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: GoogLeNet b a 30 60 90 120 150 125.80 125.69
TensorFlow Device: CPU - Batch Size: 64 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: ResNet-50 a b 9 18 27 36 45 41.59 41.56
Llamafile Test: llava-v1.5-7b-q4 - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.7 Test: llava-v1.5-7b-q4 - Acceleration: CPU b a 3 6 9 12 15 13.40 13.31
Llamafile Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.7 Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU b a 3 6 9 12 15 9.51 9.49
SPECFEM3D Model: Layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Layered Halfspace b a 30 60 90 120 150 111.67 112.49 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Llamafile Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.7 Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU b a 0.6593 1.3186 1.9779 2.6372 3.2965 2.93 2.93
Timed FFmpeg Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed FFmpeg Compilation 7.0 Time To Compile b a 11 22 33 44 55 46.10 47.37
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 3.6 Video Input: Bosphorus 4K b a 5 10 15 20 25 21.39 20.86 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 3.6 Video Input: Bosphorus 1080p a b 7 14 21 28 35 29.37 29.29 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: BMW27 - Compute: CPU-Only b a 20 40 60 80 100 91.37 91.49
Blender Blend File: Junkshop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: Junkshop - Compute: CPU-Only b a 30 60 90 120 150 115.25 117.60
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: Classroom - Compute: CPU-Only a b 60 120 180 240 300 264.21 264.21
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: Fishy Cat - Compute: CPU-Only a b 30 60 90 120 150 120.60 120.75
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: Barbershop - Compute: CPU-Only a b 200 400 600 800 1000 931.12 931.98
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: Pabellon Barcelona - Compute: CPU-Only a b 70 140 210 280 350 313.36 320.82
FFmpeg Encoder: libx264 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx264 - Scenario: Live a b 40 80 120 160 200 178.34 177.16 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Live b a 9 18 27 36 45 41.20 41.09 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Upload OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx264 - Scenario: Upload b a 3 6 9 12 15 11.18 11.15 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Upload OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Upload a b 3 6 9 12 15 9.17 9.08 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Platform OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx264 - Scenario: Platform b a 9 18 27 36 45 41.63 41.55 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Platform OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Platform a b 4 8 12 16 20 17.77 17.72 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Video On Demand OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Video On Demand a b 4 8 12 16 20 17.80 17.78 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Timed Mesa Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Mesa Compilation 24.0 Time To Compile b a 6 12 18 24 30 25.53 26.37
RocksDB Test: Overwrite OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Overwrite b a 200K 400K 600K 800K 1000K 1121724 1117358 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Random Fill OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Random Fill b a 200K 400K 600K 800K 1000K 1128064 1122751 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Random Read OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Random Read a b 16M 32M 48M 64M 80M 74178255 73039420 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Update Random OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Update Random a b 140K 280K 420K 560K 700K 633669 631933 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Sequential Fill OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Sequential Fill b a 300K 600K 900K 1200K 1500K 1326770 1307492 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Random Fill Sync OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Random Fill Sync a b 1000 2000 3000 4000 5000 4670 4668 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Read While Writing OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Read While Writing a b 800K 1600K 2400K 3200K 4000K 3741170 3698117 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Read Random Write Random OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Read Random Write Random b a 500K 1000K 1500K 2000K 2500K 2425448 2423455 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SPECFEM3D Model: Mount St. Helens OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Mount St. Helens b a 10 20 30 40 50 42.47 42.51 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
FFmpeg Encoder: libx264 - Scenario: Video On Demand OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx264 - Scenario: Video On Demand b a 9 18 27 36 45 41.63 41.62 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Phoronix Test Suite v10.8.5