dlda Tests for a future article. Intel Core i9-10980XE testing with an ASRock X299 Steel Legend (P1.50 BIOS) and llvmpipe on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2404061-PTS-DLDA801834&grs&rdt
dlda: System Details
Runs: a, b (one configuration is listed for both)

Processor: Intel Core i9-10980XE @ 4.80GHz (18 Cores / 36 Threads)
Motherboard: ASRock X299 Steel Legend (P1.50 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 4 x 8GB DDR4-3600MT/s
Disk: Samsung SSD 970 PRO 512GB
Graphics: llvmpipe
Audio: Realtek ALC1220
Network: Intel I219-V + Intel I211
OS: Ubuntu 22.04
Kernel: 6.5.0-18-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server 1.21.1.4
OpenGL: 4.5 Mesa 22.0.1 (LLVM 13.0.1 256 bits)
Vulkan: 1.2.204
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_cpufreq schedutil - CPU Microcode: 0x5003604
Python Details: Python 3.10.12
Security Details:
  gather_data_sampling: Mitigation of Microcode
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
  retbleed: Mitigation of Enhanced IBRS
  spec_rstack_overflow: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
  srbds: Not affected
  tsx_async_abort: Mitigation of TSX disabled
dlda: Result Overview

Test                                                a               b
build-mesa: Time To Compile                         26.37           25.529
build-ffmpeg: Time To Compile                       47.369          46.095
x265: Bosphorus 4K                                  20.86           21.39
blender: Pabellon Barcelona - CPU-Only              313.36          320.82
blender: Junkshop - CPU-Only                        117.6           115.25
tensorflow: CPU - 1 - GoogLeNet                     51.39           50.57
rocksdb: Rand Read                                  74178255        73039420
rocksdb: Seq Fill                                   1307492         1326770
specfem3d: Homogeneous Halfspace                    51.132054445    50.527685606
rocksdb: Read While Writing                         3741170         3698117
specfem3d: Water-layered Halfspace                  115.601201526   114.462975473
ffmpeg: libx265 - Upload                            9.17            9.08
specfem3d: Layered Halfspace                        112.489970072   111.672788927
llamafile: llava-v1.5-7b-q4 - CPU                   13.31           13.4
ffmpeg: libx264 - Live                              178.34          177.16
rocksdb: Rand Fill                                  1122751         1128064
tensorflow: CPU - 1 - ResNet-50                     14.49           14.55
rocksdb: Overwrite                                  1117358         1121724
tensorflow: CPU - 16 - AlexNet                      153.21          153.76
ffmpeg: libx265 - Platform                          17.77           17.72
rocksdb: Update Rand                                633669          631933
x265: Bosphorus 1080p                               29.37           29.29
ffmpeg: libx264 - Upload                            11.15           11.18
ffmpeg: libx265 - Live                              41.09           41.20
llamafile: mistral-7b-instruct-v0.2.Q8_0 - CPU      9.49            9.51
ffmpeg: libx264 - Platform                          41.55           41.63
tensorflow: CPU - 32 - GoogLeNet                    128.1           128.31
tensorflow: CPU - 1 - AlexNet                       14.37           14.39
blender: BMW27 - CPU-Only                           91.49           91.37
tensorflow: CPU - 16 - GoogLeNet                    128.18          128.02
blender: Fishy Cat - CPU-Only                       120.6           120.75
ffmpeg: libx265 - Video On Demand                   17.80           17.78
specfem3d: Mount St. Helens                         42.510866238    42.465093136
tensorflow: CPU - 16 - ResNet-50                    40.06           40.1
blender: Barbershop - CPU-Only                      931.12          931.98
tensorflow: CPU - 64 - GoogLeNet                    125.69          125.8
rocksdb: Read Rand Write Rand                       2423455         2425448
brl-cad: VGR Performance Metric                     211860          212029
tensorflow: CPU - 64 - ResNet-50                    41.59           41.56
tensorflow: CPU - 64 - AlexNet                      287.38          287.56
tensorflow: CPU - 32 - ResNet-50                    41.22           41.2
specfem3d: Tomographic Model                        40.100782187    40.082694716
rocksdb: Rand Fill Sync                             4670            4668
tensorflow: CPU - 32 - AlexNet                      235.74          235.64
ffmpeg: libx264 - Video On Demand                   41.62           41.63
llamafile: wizardcoder-python-34b-v1.0.Q6_K - CPU   2.93            2.93
blender: Classroom - CPU-Only                       264.21          264.21
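The overview lists raw values for runs a and b, and the direction of "better" differs per test (seconds are fewer-is-better, throughput is more-is-better). The following is a minimal sketch of how the two runs could be compared, assuming the values are copied by hand from the table. The helper names `percent_change` and `b_is_better` are hypothetical and are not part of the Phoronix Test Suite. The sketch says nothing about run-to-run variance, which this export does not report.

```python
def percent_change(a: float, b: float) -> float:
    """Change of run b relative to run a, in percent."""
    return (b - a) / a * 100.0


def b_is_better(a: float, b: float, higher_is_better: bool) -> bool:
    """True when run b beats run a under the metric's stated direction."""
    return b > a if higher_is_better else b < a


# (test name, a, b, higher_is_better); values copied from the overview,
# direction taken from each result's "Fewer/More Is Better" label.
samples = [
    ("build-mesa: Time To Compile", 26.37, 25.529, False),
    ("x265: Bosphorus 4K", 20.86, 21.39, True),
    ("blender: Pabellon Barcelona - CPU-Only", 313.36, 320.82, False),
]

for name, a, b, higher in samples:
    verdict = "b ahead" if b_is_better(a, b, higher) else "a ahead or tie"
    print(f"{name}: {percent_change(a, b):+.2f}% ({verdict})")
```

A signed percentage alone is easy to misread for time-based tests, which is why the direction flag is kept next to each pair of values.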
Timed Mesa Compilation 24.0 - Time To Compile (Seconds, Fewer Is Better): a = 26.37, b = 25.53
Timed FFmpeg Compilation 7.0 - Time To Compile (Seconds, Fewer Is Better): a = 47.37, b = 46.10
x265 3.6 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 20.86, b = 21.39. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Blender 4.1 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): a = 313.36, b = 320.82
Blender 4.1 - Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better): a = 117.60, b = 115.25
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better): a = 51.39, b = 50.57
RocksDB 9.0 - Test: Random Read (Op/s, More Is Better): a = 74178255, b = 73039420. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Sequential Fill (Op/s, More Is Better): a = 1307492, b = 1326770. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SPECFEM3D 4.1.1 - Model: Homogeneous Halfspace (Seconds, Fewer Is Better): a = 51.13, b = 50.53. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better): a = 3741170, b = 3698117. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SPECFEM3D 4.1.1 - Model: Water-layered Halfspace (Seconds, Fewer Is Better): a = 115.60, b = 114.46. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
FFmpeg 7.0 - Encoder: libx265 - Scenario: Upload (FPS, More Is Better): a = 9.17, b = 9.08. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
SPECFEM3D 4.1.1 - Model: Layered Halfspace (Seconds, Fewer Is Better): a = 112.49, b = 111.67. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Llamafile 0.7 - Test: llava-v1.5-7b-q4 - Acceleration: CPU (Tokens Per Second, More Is Better): a = 13.31, b = 13.40
FFmpeg 7.0 - Encoder: libx264 - Scenario: Live (FPS, More Is Better): a = 178.34, b = 177.16. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
RocksDB 9.0 - Test: Random Fill (Op/s, More Is Better): a = 1122751, b = 1128064. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better): a = 14.49, b = 14.55
RocksDB 9.0 - Test: Overwrite (Op/s, More Is Better): a = 1117358, b = 1121724. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better): a = 153.21, b = 153.76
FFmpeg 7.0 - Encoder: libx265 - Scenario: Platform (FPS, More Is Better): a = 17.77, b = 17.72. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
RocksDB 9.0 - Test: Update Random (Op/s, More Is Better): a = 633669, b = 631933. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
x265 3.6 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 29.37, b = 29.29. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 7.0 - Encoder: libx264 - Scenario: Upload (FPS, More Is Better): a = 11.15, b = 11.18. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 7.0 - Encoder: libx265 - Scenario: Live (FPS, More Is Better): a = 41.09, b = 41.20. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Llamafile 0.7 - Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU (Tokens Per Second, More Is Better): a = 9.49, b = 9.51
FFmpeg 7.0 - Encoder: libx264 - Scenario: Platform (FPS, More Is Better): a = 41.55, b = 41.63. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better): a = 128.10, b = 128.31
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better): a = 14.37, b = 14.39
Blender 4.1 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): a = 91.49, b = 91.37
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better): a = 128.18, b = 128.02
Blender 4.1 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): a = 120.60, b = 120.75
FFmpeg 7.0 - Encoder: libx265 - Scenario: Video On Demand (FPS, More Is Better): a = 17.80, b = 17.78. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
SPECFEM3D 4.1.1 - Model: Mount St. Helens (Seconds, Fewer Is Better): a = 42.51, b = 42.47. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better): a = 40.06, b = 40.10
Blender 4.1 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): a = 931.12, b = 931.98
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better): a = 125.69, b = 125.80
RocksDB 9.0 - Test: Read Random Write Random (Op/s, More Is Better): a = 2423455, b = 2425448. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
BRL-CAD 7.38.2 - VGR Performance Metric (VGR Performance Metric, More Is Better): a = 211860, b = 212029. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better): a = 41.59, b = 41.56
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better): a = 287.38, b = 287.56
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better): a = 41.22, b = 41.20
SPECFEM3D 4.1.1 - Model: Tomographic Model (Seconds, Fewer Is Better): a = 40.10, b = 40.08. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
RocksDB 9.0 - Test: Random Fill Sync (Op/s, More Is Better): a = 4670, b = 4668. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better): a = 235.74, b = 235.64
FFmpeg 7.0 - Encoder: libx264 - Scenario: Video On Demand (FPS, More Is Better): a = 41.62, b = 41.63. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Llamafile 0.7 - Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU (Tokens Per Second, More Is Better): a = 2.93, b = 2.93
Blender 4.1 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): a = 264.21, b = 264.21
Phoronix Test Suite v10.8.5