AMD Ryzen 9 7950X 16-Core testing with an ASUS TUF GAMING X670E-PLUS WIFI (0613 BIOS) and Gigabyte NVIDIA GeForce RTX 4090 24GB on Ubuntu 22.04 via the Phoronix Test Suite.
RTX 3090 Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201016
Graphics Notes: BAR1 / Visible vRAM Size: 32768 MiB
OpenCL Notes: GPU Compute Cores: 10496
Python Notes: Python 3.9.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
NVIDIA RTX 3090 NVIDIA 3090 Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3801 BIOS), Chipset: AMD Starship/Matisse, Memory: 32GB, Disk: 1000GB Sabrent Rocket 4.0 Plus, Graphics: NVIDIA GeForce RTX 3090 24GB, Audio: NVIDIA GA102 HD Audio, Monitor: ASUS MG28U, Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 21.10, Kernel: 5.13.0-22-generic (x86_64), Desktop: GNOME Shell 40.5, Display Server: X Server 1.20.13, Display Driver: NVIDIA 495.44, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 11.5.100, Vulkan: 1.2.186, Compiler: GCC 11.2.0 + Clang 13.0.0-2, File-System: ext4, Screen Resolution: 3840x2160
RTX 4090 Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads), Motherboard: ASUS TUF GAMING X670E-PLUS WIFI (0613 BIOS), Chipset: AMD Device 14d8, Memory: 32GB, Disk: 2048GB XPG GAMMIX S70 BLADE + 4001GB SSD 870 QVO 4TB, Graphics: Gigabyte NVIDIA GeForce RTX 4090 24GB, Audio: NVIDIA Device 22ba, Monitor: PI-KVM Video, Network: Realtek RTL8125 2.5GbE + MEDIATEK Device 0608
OS: Ubuntu 22.04, Kernel: 5.15.0-25-generic (x86_64), Desktop: GNOME Shell 42.4, Display Server: X Server 1.21.1.3, Display Driver: NVIDIA 520.56.06, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 11.8.87, Vulkan: 1.3.205, Compiler: GCC 13.0.0 20221013 + Clang 14.0.0-1ubuntu1, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-multilib --enable-checking=release --enable-languages=c,c++ --host=x86_64-linux-gnu --target=x86_64-linux-gnu -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
Graphics Notes: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.18.00.d2
OpenCL Notes: GPU Compute Cores: 16384
Python Notes: Python 3.10.6
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
ArrayFire
ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.
ArrayFire 3.7, Test: Conjugate Gradient OpenCL (ms, Fewer Is Better): RTX 3090: 1.4700; NVIDIA RTX 3090: 1.4700; NVIDIA 3090: 1.4850; RTX 4090: 0.9347. SE +/- 0.0055, N = 3; SE +/- 0.0019, N = 3. (CXX) g++ options: -rdynamic
Betsy GPU Compressor
Betsy is an open-source GPU compressor of various GPU compression techniques. Betsy is written in GLSL for Vulkan/OpenGL (compute shader) support for GPU-based texture compression.
Codec: ETC1 - Quality: Highest
RTX 3090: ./betsy: 3: ./betsy: not found
NVIDIA RTX 3090: ./betsy: 3: ./betsy: not found
NVIDIA 3090: ./betsy: 3: ./betsy: not found
RTX 4090: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found
Codec: ETC2 RGB - Quality: Highest
RTX 3090: ./betsy: 3: ./betsy: not found
NVIDIA RTX 3090: ./betsy: 3: ./betsy: not found
NVIDIA 3090: ./betsy: 3: ./betsy: not found
RTX 4090: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found
Blender
Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported.
Blender 3.0, Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better): RTX 3090: 11.04; NVIDIA RTX 3090: 11.02; NVIDIA 3090: 11.09; RTX 4090: 10.25. SE +/- 0.01, N = 3; SE +/- 3.94, N = 15.
Blender 3.0, Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better): RTX 3090: 22.76; NVIDIA RTX 3090: 22.69; NVIDIA 3090: 22.64; RTX 4090: 13.30. SE +/- 0.03, N = 3; SE +/- 0.01, N = 3.
Blender 3.0, Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better): RTX 3090: 22.86; NVIDIA RTX 3090: 22.87; NVIDIA 3090: 22.78; RTX 4090: 12.59. SE +/- 0.04, N = 3; SE +/- 0.00, N = 3.
Blender 3.0, Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better): RTX 3090: 91.13; NVIDIA RTX 3090: 91.29; NVIDIA 3090: 90.36; RTX 4090: 49.31. SE +/- 0.17, N = 3; SE +/- 0.17, N = 3.
Blender 3.0, Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): RTX 3090: 6.94; NVIDIA RTX 3090: 6.91; NVIDIA 3090: 6.92; RTX 4090: 4.48. SE +/- 0.01, N = 3; SE +/- 0.04, N = 15.
Blender 3.0, Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): RTX 3090: 17.40; NVIDIA RTX 3090: 17.43; NVIDIA 3090: 17.98; RTX 4090: 11.13. SE +/- 0.02, N = 3; SE +/- 0.02, N = 3.
Blender 3.0, Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): RTX 3090: 11.22; NVIDIA RTX 3090: 11.32; NVIDIA 3090: 11.22; RTX 4090: 6.05. SE +/- 0.09, N = 3; SE +/- 0.05, N = 13.
Blender 3.0, Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): RTX 3090: 56.19; NVIDIA RTX 3090: 56.35; NVIDIA 3090: 56.34; RTX 4090: 33.55. SE +/- 0.06, N = 3; SE +/- 0.02, N = 3.
Blender 3.0, Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better): RTX 3090: 48.11; NVIDIA RTX 3090: 48.18; NVIDIA 3090: 48.09; RTX 4090: 22.22. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3.
Blender 3.0, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): RTX 3090: 17.96; NVIDIA RTX 3090: 18.00; NVIDIA 3090: 17.96; RTX 4090: 11.06. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3.
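Because the Blender results are render times, a lower number is the better result, and the speedup of one run over another is the ratio baseline / candidate. The following is a minimal Python sketch of that arithmetic. The dictionary `BLENDER_SECONDS` and the function `speedup_fewer_is_better` are illustrative names, not part of the Phoronix Test Suite, and only the "RTX 3090" and "RTX 4090" runs are copied from the results above.

```python
# Blender 3.0 render times in seconds (fewer is better), copied from the results above.
# Tuple order: ("RTX 3090" run, "RTX 4090" run). All names here are illustrative.
BLENDER_SECONDS = {
    ("BMW27", "CUDA"): (11.04, 10.25),
    ("Classroom", "CUDA"): (22.76, 13.30),
    ("Fishy Cat", "CUDA"): (22.86, 12.59),
    ("Barbershop", "CUDA"): (91.13, 49.31),
    ("Pabellon Barcelona", "CUDA"): (48.11, 22.22),
    ("BMW27", "NVIDIA OptiX"): (6.94, 4.48),
    ("Classroom", "NVIDIA OptiX"): (17.40, 11.13),
    ("Fishy Cat", "NVIDIA OptiX"): (11.22, 6.05),
    ("Barbershop", "NVIDIA OptiX"): (56.19, 33.55),
    ("Pabellon Barcelona", "NVIDIA OptiX"): (17.96, 11.06),
}


def speedup_fewer_is_better(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is when lower values are better."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("times must be positive")
    return baseline / candidate


# Print one speedup per blend file and compute backend.
for (scene, backend), (rtx3090, rtx4090) in BLENDER_SECONDS.items():
    ratio = speedup_fewer_is_better(rtx3090, rtx4090)
    print(f"{scene:<20} {backend:<13} {ratio:.2f}x")
```

On these numbers the ratio ranges from about 1.08x (BMW27 with CUDA) to about 2.17x (Pabellon Barcelona with CUDA). The two systems also differ in CPU, OS, and driver version, so the ratios describe the whole configurations rather than the GPUs alone.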
Caffe
This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs.
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
RTX 3090: @ 0x7f673b18935f google::LogMessageFatal::~LogMessageFatal()
NVIDIA RTX 3090: @ 0x7ff1662db35f google::LogMessageFatal::~LogMessageFatal()
NVIDIA 3090: @ 0x7fa229dc935f google::LogMessageFatal::~LogMessageFatal()
RTX 4090: The test quit with a non-zero exit status. E: @ 0x7f142971578f google::LogMessageFatal::~LogMessageFatal()
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
RTX 3090: @ 0x7ff7c4d3335f google::LogMessageFatal::~LogMessageFatal()
NVIDIA RTX 3090: @ 0x7efc6619835f google::LogMessageFatal::~LogMessageFatal()
NVIDIA 3090: @ 0x7f70627d835f google::LogMessageFatal::~LogMessageFatal()
RTX 4090: The test quit with a non-zero exit status. E: @ 0x7f60d7a0478f google::LogMessageFatal::~LogMessageFatal()
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
RTX 3090: @ 0x7f2baa38435f google::LogMessageFatal::~LogMessageFatal()
NVIDIA RTX 3090: @ 0x7fa1401b935f google::LogMessageFatal::~LogMessageFatal()
NVIDIA 3090: @ 0x7f06ca10535f google::LogMessageFatal::~LogMessageFatal()
RTX 4090: The test quit with a non-zero exit status. E: @ 0x7f0d9b92678f google::LogMessageFatal::~LogMessageFatal()
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
RTX 3090: @ 0x7fcf7a66c35f google::LogMessageFatal::~LogMessageFatal()
NVIDIA RTX 3090: @ 0x7fc8cad7235f google::LogMessageFatal::~LogMessageFatal()
NVIDIA 3090: @ 0x7fc82391235f google::LogMessageFatal::~LogMessageFatal()
RTX 4090: The test quit with a non-zero exit status. E: @ 0x7fd59da0978f google::LogMessageFatal::~LogMessageFatal()
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
RTX 3090: @ 0x7f07fc1da35f google::LogMessageFatal::~LogMessageFatal()
NVIDIA RTX 3090: @ 0x7fe536ef535f google::LogMessageFatal::~LogMessageFatal()
NVIDIA 3090: @ 0x7fcccd8ca35f google::LogMessageFatal::~LogMessageFatal()
RTX 4090: The test quit with a non-zero exit status. E: @ 0x7fe273b5878f google::LogMessageFatal::~LogMessageFatal()
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
RTX 3090: @ 0x7f9e6bb6a35f google::LogMessageFatal::~LogMessageFatal()
NVIDIA RTX 3090: @ 0x7f9599f2935f google::LogMessageFatal::~LogMessageFatal()
NVIDIA 3090: @ 0x7f1a673f235f google::LogMessageFatal::~LogMessageFatal()
RTX 4090: The test quit with a non-zero exit status. E: @ 0x7f7ae429478f google::LogMessageFatal::~LogMessageFatal()
Chaos Group V-RAY
This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering.
Chaos Group V-RAY 5, Mode: NVIDIA RTX GPU (vrays, More Is Better): RTX 3090: 2856; NVIDIA RTX 3090: 2856; NVIDIA 3090: 2829. SE +/- 11.72, N = 3.
cl-mem 2017-01-13, Benchmark: Read (GB/s, More Is Better): RTX 3090: 794.6; NVIDIA RTX 3090: 794.3; NVIDIA 3090: 795.2; RTX 4090: 887.9. SE +/- 0.84, N = 3; SE +/- 0.09, N = 3. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13, Benchmark: Write (GB/s, More Is Better): RTX 3090: 743.1; NVIDIA RTX 3090: 744.7; NVIDIA 3090: 742.8; RTX 4090: 804.3. SE +/- 0.33, N = 3; SE +/- 0.43, N = 3. (CC) gcc options: -O2 -flto -lOpenCL
clpeak, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better): RTX 3090: 35227.21; NVIDIA RTX 3090: 35136.60; NVIDIA 3090: 35225.40; RTX 4090: 81211.79. SE +/- 88.80, N = 3; SE +/- 435.51, N = 3. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak, OpenCL Test: Double-Precision Double (GFLOPS, More Is Better): RTX 3090: 658.16; NVIDIA RTX 3090: 657.71; NVIDIA 3090: 658.16; RTX 4090: 1408.89. SE +/- 0.03, N = 3; SE +/- 0.23, N = 3. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better): RTX 3090: 813.42; NVIDIA RTX 3090: 813.45; NVIDIA 3090: 813.47; RTX 4090: 870.62. SE +/- 0.02, N = 3; SE +/- 0.19, N = 3. (CXX) g++ options: -O3 -rdynamic -lOpenCL
ET: Legacy
ETLegacy is an open-source engine evolution of Wolfenstein: Enemy Territory, a World War II era first person shooter that was released for free by Splash Damage using the id Tech 3 engine.
ET: Legacy 2.78, Resolution: 1920 x 1080 (Frames Per Second, More Is Better): RTX 3090: 652.3; NVIDIA RTX 3090: 654.5; NVIDIA 3090: 659.8; RTX 4090: 794.0. SE +/- 3.73, N = 3; SE +/- 3.83, N = 3.
ET: Legacy 2.78, Resolution: 1920 x 1200 (Frames Per Second, More Is Better): RTX 3090: 645.3; NVIDIA RTX 3090: 659.1; NVIDIA 3090: 657.8; RTX 4090: 811.4. SE +/- 2.46, N = 3; SE +/- 0.47, N = 3.
ET: Legacy 2.78, Resolution: 2560 x 1440 (Frames Per Second, More Is Better): RTX 3090: 640.2; NVIDIA RTX 3090: 647.6; NVIDIA 3090: 643.7; RTX 4090: 799.7. SE +/- 7.88, N = 4; SE +/- 6.23, N = 3.
ET: Legacy 2.78, Resolution: 3840 x 2160 (Frames Per Second, More Is Better): RTX 3090: 648.4; NVIDIA RTX 3090: 638.7; NVIDIA 3090: 635.5; RTX 4090: 219.3. SE +/- 6.04, N = 3; SE +/- 113.10, N = 6.
FinanceBench
FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware.
FinanceBench 2016-07-25, Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better): RTX 3090: 6.259; NVIDIA RTX 3090: 6.258; NVIDIA 3090: 6.256; RTX 4090: 2.953. SE +/- 0.004, N = 3; SE +/- 0.024, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp
GROMACS
This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds.
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
RTX 3090: ./gromacs: 5: /cuda-build/run-gromacs: not found
NVIDIA RTX 3090: ./gromacs: 5: /cuda-build/run-gromacs: not found
NVIDIA 3090: ./gromacs: 5: /cuda-build/run-gromacs: not found
RTX 4090: The test quit with a non-zero exit status. E: ./gromacs: 5: /cuda-build/run-gromacs: not found
Hashcat
Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm.
Hashcat 6.2.4, Benchmark: MD5 (H/s, More Is Better): RTX 3090: 71446900000; NVIDIA RTX 3090: 71411366667; NVIDIA 3090: 71436200000; RTX 4090: 151433333333. SE +/- 78065236.25, N = 3; SE +/- 166666666.67, N = 3.
Hashcat 6.2.4, Benchmark: SHA1 (H/s, More Is Better): RTX 3090: 22599300000; NVIDIA RTX 3090: 22678500000; NVIDIA 3090: 22813100000; RTX 4090: 49722966667. SE +/- 32122473.96, N = 3; SE +/- 15117135.24, N = 3.
Hashcat 6.2.4, Benchmark: 7-Zip (H/s, More Is Better): RTX 3090: 1138500; NVIDIA RTX 3090: 1140100; NVIDIA 3090: 1141500; RTX 4090: 2547700. SE +/- 4014.97, N = 3; SE +/- 4152.51, N = 3.
Hashcat 6.2.4, Benchmark: SHA-512 (H/s, More Is Better): RTX 3090: 2887000000; NVIDIA RTX 3090: 2884666667; NVIDIA 3090: 2892900000; RTX 4090: 6300233333. SE +/- 2179704.36, N = 3; SE +/- 1942792.95, N = 3.
Hashcat 6.2.4, Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better): RTX 3090: 846100; NVIDIA RTX 3090: 825633; NVIDIA 3090: 816300; RTX 4090: 1827614. SE +/- 4520.08, N = 3; SE +/- 16650.59, N = 7.
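The five Hashcat results span several orders of magnitude, so averaging the raw H/s values would let MD5 dominate. One common way to summarize them is the geometric mean of the per-test ratios. The sketch below assumes that approach; `HASHCAT_HS`, `ratio_more_is_better`, and `geometric_mean` are illustrative names, and only the "RTX 3090" and "RTX 4090" runs are copied from the results above.

```python
import math

# Hashcat 6.2.4 throughput in H/s (more is better), copied from the results above.
# Tuple order: ("RTX 3090" run, "RTX 4090" run). All names here are illustrative.
HASHCAT_HS = {
    "MD5": (71446900000, 151433333333),
    "SHA1": (22599300000, 49722966667),
    "7-Zip": (1138500, 2547700),
    "SHA-512": (2887000000, 6300233333),
    "TrueCrypt RIPEMD160 + XTS": (846100, 1827614),
}


def ratio_more_is_better(baseline: float, candidate: float) -> float:
    """Return candidate / baseline, the speedup when higher values are better."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("throughput must be positive")
    return candidate / baseline


def geometric_mean(values) -> float:
    """Geometric mean via the mean of logarithms; requires positive inputs."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = {name: ratio_more_is_better(a, b) for name, (a, b) in HASHCAT_HS.items()}
for name, ratio in ratios.items():
    print(f"{name:<26} {ratio:.2f}x")
print(f"{'geometric mean':<26} {geometric_mean(ratios.values()):.2f}x")
```

Each individual ratio here falls between roughly 2.1x and 2.2x, so the geometric mean lands in the same range. The geometric mean matters more when ratios differ widely between tests.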
IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better): RTX 3090: 53.44; NVIDIA RTX 3090: 53.50; NVIDIA 3090: 53.60; RTX 4090: 74.73. SE +/- 0.03, N = 3; SE +/- 0.02, N = 3.
LeelaChessZero
LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking.
LeelaChessZero 0.28, Backend: OpenCL (Nodes Per Second, More Is Better): RTX 3090: 22739; NVIDIA RTX 3090: 22711; NVIDIA 3090: 23029. SE +/- 98.83, N = 3. (CXX) g++ options: -flto -pthread
Backend: OpenCL
RTX 4090: The test quit with a non-zero exit status. E: ./lczero: line 4: ./lc0: No such file or directory
Libplacebo
Libplacebo is a multimedia rendering library based on the core rendering code of the MPV player. The libplacebo benchmark relies on the Vulkan API and tests various primitives.
RTX 3090: The test quit with a non-zero exit status.
NVIDIA RTX 3090: The test quit with a non-zero exit status.
NVIDIA 3090: The test quit with a non-zero exit status.
RTX 4090: The test quit with a non-zero exit status. E: ./libplacebo: 3: ./src/bench: not found
LuxCoreRender
LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces.
LuxCoreRender 2.5, Scene: DLSC - Acceleration: GPU (M samples/sec, More Is Better): NVIDIA RTX 3090: 11.57 (MIN: 11.39 / MAX: 11.73); NVIDIA 3090: 11.61 (MIN: 11.41 / MAX: 11.72); RTX 4090: 21.38 (MIN: 18.47 / MAX: 21.85). SE +/- 0.03, N = 3; SE +/- 0.02, N = 3.
Scene: DLSC - Acceleration: GPU
RTX 3090: Test failed to run.
LuxCoreRender 2.5, Scene: Danish Mood - Acceleration: GPU (M samples/sec, More Is Better): RTX 3090: 9.18 (MIN: 3.53 / MAX: 10.86); NVIDIA RTX 3090: 9.20 (MIN: 3.03 / MAX: 10.91); NVIDIA 3090: 9.01 (MIN: 3.3 / MAX: 10.77); RTX 4090: 17.39 (MIN: 0.07 / MAX: 21.12). SE +/- 0.08, N = 3; SE +/- 0.20, N = 3.
LuxCoreRender 2.5, Scene: Orange Juice - Acceleration: GPU (M samples/sec, More Is Better): RTX 3090: 10.45 (MIN: 8.53 / MAX: 13.8); NVIDIA RTX 3090: 10.41 (MIN: 8.47 / MAX: 13.76); NVIDIA 3090: 10.43 (MIN: 8.54 / MAX: 13.75); RTX 4090: 17.47 (MIN: 0.39 / MAX: 25.03). SE +/- 0.10, N = 3; SE +/- 0.04, N = 3.
LuxCoreRender 2.5, Scene: LuxCore Benchmark - Acceleration: GPU (M samples/sec, More Is Better): NVIDIA RTX 3090: 11.20 (MIN: 3.54 / MAX: 13.16); NVIDIA 3090: 11.22 (MIN: 3.57 / MAX: 13.17); RTX 4090: 18.31 (MIN: 7.1 / MAX: 23.24). SE +/- 0.01, N = 3; SE +/- 0.09, N = 3.
Scene: LuxCore Benchmark - Acceleration: GPU
RTX 3090: [LuxCore][5.612] PhotonGI estimated current indirect photon error: 23.06%
LuxCoreRender 2.5, Scene: Rainbow Colors and Prism - Acceleration: GPU (M samples/sec, More Is Better): RTX 3090: 32.33 (MIN: 30.03 / MAX: 34.38); NVIDIA RTX 3090: 32.60 (MIN: 30.12 / MAX: 34.6); NVIDIA 3090: 32.25 (MIN: 29.03 / MAX: 34.55); RTX 4090: 35.74 (MIN: 34.18 / MAX: 37.28). SE +/- 0.21, N = 3; SE +/- 0.22, N = 3.
MandelGPU
MandelGPU is an OpenCL benchmark. This test runs the OpenCL rendering float4 kernel with a maximum of 4096 iterations.
MandelGPU 1.3pts1, OpenCL Device: GPU (Samples/sec, More Is Better): RTX 3090: 481322759.6; NVIDIA RTX 3090: 475794831.7; NVIDIA 3090: 472928214.8; RTX 4090: 587587462.0. SE +/- 1595019.30, N = 3; SE +/- 430679.16, N = 3. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
Mixbench
A benchmark suite for GPUs on mixed operational intensity kernels.
Backend: OpenCL - Benchmark: Integer
RTX 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-ocl-ro: not found
Backend: NVIDIA CUDA - Benchmark: Integer
RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-cuda-ro: not found
Backend: OpenCL - Benchmark: Double Precision
RTX 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-ocl-ro: not found
Backend: OpenCL - Benchmark: Single Precision
RTX 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-ocl-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-ocl-ro: not found
Backend: NVIDIA CUDA - Benchmark: Half Precision
RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-cuda-ro: not found
Backend: NVIDIA CUDA - Benchmark: Double Precision
RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-cuda-ro: not found
Backend: NVIDIA CUDA - Benchmark: Single Precision
RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA RTX 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
NVIDIA 3090: ./mixbench: 3: ./mixbench-cuda-ro: not found
RTX 4090: The test quit with a non-zero exit status. E: ./mixbench: 3: ./mixbench-cuda-ro: not found
NAMD CUDA
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. This version of the NAMD test profile uses CUDA GPU acceleration.
NAMD CUDA 2.14, ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better): RTX 3090: 0.12792; NVIDIA RTX 3090: 0.12912; NVIDIA 3090: 0.12779; RTX 4090: 0.16788. SE +/- 0.00047, N = 3; SE +/- 0.00009, N = 3.
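NAMD reports days/ns, the wall-clock days needed per simulated nanosecond, which is why a lower value is better. Taking the reciprocal gives ns/day, where a higher value is better. A small Python sketch of the conversion follows; `NAMD_DAYS_PER_NS` and `ns_per_day` are illustrative names, and the values are copied from the result above.

```python
# NAMD CUDA 2.14 results in days/ns (fewer is better), copied from the result above.
# The dictionary and function names are illustrative only.
NAMD_DAYS_PER_NS = {
    "RTX 3090": 0.12792,
    "NVIDIA RTX 3090": 0.12912,
    "NVIDIA 3090": 0.12779,
    "RTX 4090": 0.16788,
}


def ns_per_day(days_per_ns: float) -> float:
    """Convert days per simulated nanosecond into nanoseconds simulated per day."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


for system, value in NAMD_DAYS_PER_NS.items():
    print(f"{system:<16} {value:.5f} days/ns = {ns_per_day(value):.2f} ns/day")
```

In ns/day terms the "RTX 3090" run works out to about 7.8 and the "RTX 4090" run to about 6.0, so the RTX 4090 configuration is the slowest of the four runs in this test.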
NCNN
NCNN is a high-performance neural network inference framework, developed by Tencent and optimized for mobile and other platforms.
NCNN 20210720, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): RTX 3090: 4.28 (MIN: 4.15 / MAX: 15.44); NVIDIA RTX 3090: 4.22 (MIN: 4.13 / MAX: 4.48); NVIDIA 3090: 4.23 (MIN: 4.16 / MAX: 4.41); RTX 4090: 3.05 (MIN: 3.02 / MAX: 4.05). SE +/- 0.01, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): RTX 3090: 1.98 (MIN: 1.94 / MAX: 6.06); NVIDIA RTX 3090: 1.97 (MIN: 1.94 / MAX: 3.15); NVIDIA 3090: 1.97 (MIN: 1.94 / MAX: 2.26); RTX 4090: 1.27 (MIN: 1.25 / MAX: 2.23). SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): RTX 3090: 3.17 (MIN: 2.21 / MAX: 20.43); NVIDIA RTX 3090: 2.24 (MIN: 2.21 / MAX: 3.37); NVIDIA 3090: 2.24 (MIN: 2.21 / MAX: 4.61); RTX 4090: 1.50 (MIN: 1.47 / MAX: 2.71). SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): RTX 3090: 1.94 (MIN: 1.78 / MAX: 2.28); NVIDIA RTX 3090: 1.80 (MIN: 1.76 / MAX: 3); NVIDIA 3090: 1.80 (MIN: 1.77 / MAX: 2.69); RTX 4090: 1.25 (MIN: 1.23 / MAX: 2.16). SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): RTX 3090: 2.08 (MIN: 2.06 / MAX: 2.29); NVIDIA RTX 3090: 2.08 (MIN: 2.05 / MAX: 3.23); NVIDIA 3090: 2.08 (MIN: 2.06 / MAX: 2.51); RTX 4090: 1.32 (MIN: 1.31 / MAX: 2). SE +/- 0.01, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): RTX 3090: 3.26 (MIN: 3.23 / MAX: 3.42); NVIDIA RTX 3090: 3.25 (MIN: 3.22 / MAX: 4.42); NVIDIA 3090: 3.42 (MIN: 3.24 / MAX: 4.26); RTX 4090: 2.11 (MIN: 2.09 / MAX: 2.75). SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): RTX 3090: 1.06 (MIN: 1.02 / MAX: 2.12); NVIDIA RTX 3090: 1.03 (MIN: 1 / MAX: 2.07); NVIDIA 3090: 1.05 (MIN: 1.01 / MAX: 2.18); RTX 4090: 0.78 (MIN: 0.76 / MAX: 1.68). SE +/- 0.00, N = 3; SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): RTX 3090: 5.85 (MIN: 3.71 / MAX: 29.49); NVIDIA RTX 3090: 6.07 (MIN: 3.71 / MAX: 31.52); NVIDIA 3090: 5.75 (MIN: 3.72 / MAX: 31); RTX 4090: 2.69 (MIN: 2.2 / MAX: 4.95). SE +/- 0.26, N = 3; SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): RTX 3090: 4.17 (MIN: 4.12 / MAX: 4.86); NVIDIA RTX 3090: 4.18 (MIN: 4.12 / MAX: 11.44); NVIDIA 3090: 4.17 (MIN: 4.13 / MAX: 4.34); RTX 4090: 3.68 (MIN: 3.66 / MAX: 4.19). SE +/- 0.02, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): RTX 3090: 1.67 (MIN: 1.64 / MAX: 1.9); NVIDIA RTX 3090: 1.67 (MIN: 1.64 / MAX: 4.75); NVIDIA 3090: 1.70 (MIN: 1.65 / MAX: 9.46); RTX 4090: 1.13 (MIN: 1.11 / MAX: 1.43). SE +/- 0.00, N = 3; SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): RTX 3090: 1.91 (MIN: 1.86 / MAX: 9.04); NVIDIA RTX 3090: 1.92 (MIN: 1.87 / MAX: 7.42); NVIDIA 3090: 1.90 (MIN: 1.87 / MAX: 2.6); RTX 4090: 1.87 (MIN: 1.83 / MAX: 2.45). SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): RTX 3090: 3.55 (MIN: 3.52 / MAX: 3.69); NVIDIA RTX 3090: 3.55 (MIN: 3.52 / MAX: 3.75); NVIDIA 3090: 3.56 (MIN: 3.53 / MAX: 3.69); RTX 4090: 2.81 (MIN: 2.79 / MAX: 3.38). SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better): RTX 3090: 7.02 (MIN: 6.34 / MAX: 30.81); NVIDIA RTX 3090: 6.72 (MIN: 6.3 / MAX: 19.53); NVIDIA 3090: 7.43 (MIN: 6.36 / MAX: 36.74); RTX 4090: 4.80 (MIN: 4.73 / MAX: 5.37). SE +/- 0.04, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): RTX 3090: 15.96 (MIN: 5.77 / MAX: 36.07); NVIDIA RTX 3090: 17.31 (MIN: 5.63 / MAX: 36.32); NVIDIA 3090: 19.73 (MIN: 7.36 / MAX: 35.55); RTX 4090: 3.43 (MIN: 3.34 / MAX: 4.06). SE +/- 1.24, N = 3; SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better): RTX 3090: 2.54 (MIN: 2.5 / MAX: 3.7); NVIDIA RTX 3090: 2.53 (MIN: 2.49 / MAX: 3.64); NVIDIA 3090: 2.53 (MIN: 2.51 / MAX: 2.76); RTX 4090: 1.64 (MIN: 1.61 / MAX: 2.15). SE +/- 0.01, N = 3; SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ParaView
This test runs benchmarks of ParaView, an open-source data analytics and visualization application. ParaView describes itself as "an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques."
ParaView 5.9, Test: Many Spheres - Resolution: 1920 x 1080 (Frames / Sec, More Is Better): RTX 3090: 93.95; NVIDIA RTX 3090: 94.05; NVIDIA 3090: 93.79; RTX 4090: 174.04. SE +/- 0.05, N = 3; SE +/- 0.10, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better): RTX 3090: 9419.40; NVIDIA RTX 3090: 9429.54; NVIDIA 3090: 9402.63; RTX 4090: 17448.87. SE +/- 4.72, N = 3; SE +/- 9.92, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 1920 x 1200 (Frames / Sec, More Is Better): RTX 3090: 93.91; NVIDIA RTX 3090: 94.28; NVIDIA 3090: 92.13; RTX 4090: 174.29. SE +/- 0.06, N = 3; SE +/- 0.25, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 1920 x 1200 (MiPolys / Sec, More Is Better): RTX 3090: 9414.69; NVIDIA RTX 3090: 9451.90; NVIDIA 3090: 9236.63; RTX 4090: 17473.09. SE +/- 5.73, N = 3; SE +/- 25.11, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 2560 x 1440 (Frames / Sec, More Is Better): RTX 3090: 93.20; NVIDIA RTX 3090: 93.33; NVIDIA 3090: 93.57; RTX 4090: 173.98. SE +/- 0.08, N = 3; SE +/- 0.02, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 2560 x 1440 (MiPolys / Sec, More Is Better): RTX 3090: 9343.82; NVIDIA RTX 3090: 9356.71; NVIDIA 3090: 9380.93; RTX 4090: 17441.98. SE +/- 7.77, N = 3; SE +/- 2.22, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 3840 x 2160 (Frames / Sec, More Is Better): RTX 3090: 91.46; NVIDIA RTX 3090: 91.94; NVIDIA 3090: 92.03; RTX 4090: 174.52. SE +/- 0.06, N = 3; SE +/- 0.24, N = 3.
ParaView 5.9, Test: Many Spheres - Resolution: 3840 x 2160 (MiPolys / Sec, More Is Better): RTX 3090: 9169.64; NVIDIA RTX 3090: 9217.23; NVIDIA 3090: 9226.85; RTX 4090: 17496.04. SE +/- 6.01, N = 3; SE +/- 24.18, N = 3.
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 1920 x 1080 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 5.50, N = 3 SE +/- 2.67, N = 3 679.67 689.81 696.47 1111.66
OpenBenchmarking.org MiVoxels / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 1920 x 1080 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 4K 8K 12K 16K 20K SE +/- 88.00, N = 3 SE +/- 42.71, N = 3 10874.71 11036.99 11143.49 17786.53
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 1920 x 1200 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 4.68, N = 3 SE +/- 4.05, N = 3 634.47 643.75 656.12 1101.32
OpenBenchmarking.org MiVoxels / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 1920 x 1200 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 4K 8K 12K 16K 20K SE +/- 74.83, N = 3 SE +/- 64.74, N = 3 10151.48 10300.04 10497.89 17621.06
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 2560 x 1440 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 3.11, N = 3 SE +/- 6.66, N = 3 559.05 544.98 565.60 1127.10
OpenBenchmarking.org MiVoxels / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 2560 x 1440 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 4K 8K 12K 16K 20K SE +/- 49.80, N = 3 SE +/- 106.54, N = 3 8944.75 8719.75 9049.54 18033.64
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 3840 x 2160 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 4.06, N = 5 SE +/- 9.85, N = 3 384.84 376.77 381.69 1137.66
OpenBenchmarking.org MiVoxels / Sec, More Is Better ParaView 5.9 Test: Wavelet Volume - Resolution: 3840 x 2160 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 4K 8K 12K 16K 20K SE +/- 65.03, N = 5 SE +/- 157.61, N = 3 6157.36 6028.37 6107.07 18202.54
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 1920 x 1080 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 5.27, N = 3 SE +/- 1.60, N = 3 602.09 595.52 599.52 798.37
OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 1920 x 1080 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 2K 4K 6K 8K 10K SE +/- 54.90, N = 3 SE +/- 16.65, N = 3 6274.52 6206.04 6247.74 8320.03
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 1920 x 1200 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 5.24, N = 3 SE +/- 0.35, N = 3 561.32 563.05 560.03 800.73
OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 1920 x 1200 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 2K 4K 6K 8K 10K SE +/- 54.66, N = 3 SE +/- 3.68, N = 3 5849.68 5867.72 5836.22 8344.58
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 2560 x 1440 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 1.60, N = 3 SE +/- 1.13, N = 3 514.93 515.78 521.09 794.76
OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 2560 x 1440 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 2K 4K 6K 8K 10K SE +/- 16.63, N = 3 SE +/- 11.72, N = 3 5366.18 5374.98 5430.39 8282.33
OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 3840 x 2160 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 200 400 600 800 1000 SE +/- 2.99, N = 3 SE +/- 1.68, N = 3 396.74 390.64 391.63 796.14
OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.9 Test: Wavelet Contour - Resolution: 3840 x 2160 RTX 3090 NVIDIA RTX 3090 NVIDIA 3090 RTX 4090 2K 4K 6K 8K 10K SE +/- 31.15, N = 3 SE +/- 17.54, N = 3 4134.49 4070.88 4081.30 8296.77
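One way to summarize the ParaView numbers above is to divide the RTX 4090 result by the mean of the three RTX 3090 runs. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite; the `speedup` helper and the `results` dictionary are hypothetical names, and the values are copied from the Frames / Sec results above. The RTX 4090 ran in a different system (Ryzen 9 7950X) than the RTX 3090 (Ryzen 9 5950X), so the ratio reflects the whole platform rather than the GPU alone.

```python
from statistics import mean

# Frames / Sec values copied from the ParaView 5.9 results above.
# Each entry: (three RTX 3090 runs, single RTX 4090 result).
results = {
    "Many Spheres 1920 x 1080": ([93.95, 94.05, 93.79], 174.04),
    "Wavelet Volume 3840 x 2160": ([384.84, 376.77, 381.69], 1137.66),
    "Wavelet Contour 3840 x 2160": ([396.74, 390.64, 391.63], 796.14),
}


def speedup(baseline_runs, new_value):
    """Ratio of the new result to the mean of the baseline runs.

    Only meaningful as written for "More Is Better" metrics; for
    "Fewer Is Better" metrics the ratio would need to be inverted.
    """
    return new_value / mean(baseline_runs)


for name, (rtx3090_runs, rtx4090) in results.items():
    print(f"{name}: {speedup(rtx3090_runs, rtx4090):.2f}x")
```

Averaging the three RTX 3090 runs first keeps a single outlier run from dominating the comparison.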
PlaidML This test profile uses the PlaidML deep learning framework developed by Intel to run various benchmarks. Learn more via the OpenBenchmarking.org test page.
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
RTX 3090: Test failed to run.
NVIDIA RTX 3090: Test failed to run.
NVIDIA 3090: Test failed to run.
RTX 4090: The test quit with a non-zero exit status (reported three times). E: ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
RTX 3090: Test failed to run.
NVIDIA RTX 3090: Test failed to run.
NVIDIA 3090: Test failed to run.
RTX 4090: The test quit with a non-zero exit status (reported three times). E: ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
RTX 3090: Test failed to run.
NVIDIA RTX 3090: Test failed to run.
NVIDIA 3090: Test failed to run.
RTX 4090: The test quit with a non-zero exit status (reported three times). E: ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
RTX 3090: Test failed to run.
NVIDIA RTX 3090: Test failed to run.
NVIDIA 3090: Test failed to run.
RTX 4090: The test quit with a non-zero exit status (reported three times). E: ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
RTX 3090: Test failed to run.
NVIDIA RTX 3090: Test failed to run.
NVIDIA 3090: Test failed to run.
RTX 4090: The test quit with a non-zero exit status (reported three times). E: ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
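The ImportError reported for the RTX 4090 runs is consistent with Python 3.10 having removed the abstract base class aliases (such as `Iterable`) from the top-level `collections` module; they now live only in `collections.abc`. The log does not show which module performs the failing import, so the following is only a sketch of the general compatibility pattern, not a patch for PlaidML or its dependencies. The `is_iterable` helper is a hypothetical name used for illustration.

```python
# Python 3.10 removed the ABC aliases from `collections`;
# `collections.abc` has provided them since Python 3.3.
from collections.abc import Iterable


def is_iterable(obj):
    """Return True if obj supports iteration via __iter__."""
    return isinstance(obj, Iterable)


print(is_iterable([1, 2, 3]))  # a list is iterable
print(is_iterable(42))         # an int is not
```

Code that still uses `from collections import Iterable` fails at import time on Python 3.10 and later, which matches the path `/usr/lib/python3.10/collections/__init__.py` in the error above.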
RealSR-NCNN RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better): RTX 3090: 5.621; NVIDIA RTX 3090: 5.715; NVIDIA 3090: 5.673; RTX 4090: 4.166. SE +/- 0.022, N = 3; SE +/- 0.005, N = 3.
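The "SE +/- x, N = n" figures attached to results like the one above are standard errors over n runs. The following is a minimal sketch assuming the conventional definition (sample standard deviation divided by the square root of the number of runs); whether the Phoronix Test Suite computes it in exactly this way is an assumption. The individual run times are hypothetical, since the result page reports only the averages.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run times in seconds (illustrative values, not from the results).
run_times = [5.60, 5.62, 5.65]

print(f"mean = {statistics.mean(run_times):.3f} s")
print(f"SE +/- {standard_error(run_times):.3f}, N = {len(run_times)}")
```

A small standard error relative to the gap between two averages suggests the difference is not just run-to-run noise.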
RedShift Demo This is a test of MAXON's RedShift demo build that currently requires NVIDIA GPU acceleration. Learn more via the OpenBenchmarking.org test page.
RTX 3090: The test quit with a non-zero exit status.
NVIDIA RTX 3090: The test quit with a non-zero exit status.
NVIDIA 3090: The test quit with a non-zero exit status.
RTX 4090: The test quit with a non-zero exit status (reported three times). E: ./redshift: 3: /usr/redshift/bin/redshiftBenchmark: not found
Rodinia Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better): RTX 3090: 3.843; NVIDIA RTX 3090: 3.694; NVIDIA 3090: 3.627; RTX 4090: 2.055. SE +/- 0.013, N = 3; SE +/- 0.050, N = 3. (CXX) g++ options: -O2 -lOpenCL
SHOC Scalable HeterOgeneous Computing The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better): RTX 3090: 429.08; NVIDIA RTX 3090: 430.35; NVIDIA 3090: 430.28; RTX 4090: 646.06. SE +/- 0.21, N = 3; SE +/- 0.34, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better): RTX 3090: 25.4781; NVIDIA RTX 3090: 25.4840; NVIDIA 3090: 25.4505; RTX 4090: 3.3663. SE +/- 0.0039, N = 3; SE +/- 0.0005, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better): RTX 3090: 2100.51; NVIDIA RTX 3090: 2101.08; NVIDIA 3090: 2101.82; RTX 4090: 2787.07. SE +/- 0.42, N = 3; SE +/- 1.73, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better): RTX 3090: 44.19; NVIDIA RTX 3090: 44.55; NVIDIA 3090: 44.54; RTX 4090: 94.62. SE +/- 0.00, N = 3; SE +/- 1.03, N = 15.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better): RTX 3090: 391.21; NVIDIA RTX 3090: 391.57; NVIDIA 3090: 392.03; RTX 4090: 953.25. SE +/- 0.19, N = 3; SE +/- 0.25, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better): RTX 3090: 8102.38; NVIDIA RTX 3090: 8336.71; NVIDIA 3090: 8098.46; RTX 4090: 27192.40. SE +/- 67.62, N = 3; SE +/- 2.47, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better): RTX 3090: 40753.9; NVIDIA RTX 3090: 40238.8; NVIDIA 3090: 40566.2; RTX 4090: 88132.7. SE +/- 297.72, N = 3; SE +/- 781.51, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better): RTX 3090: 26.3342; NVIDIA RTX 3090: 26.3282; NVIDIA 3090: 26.3202; RTX 4090: 3.3878. SE +/- 0.0129, N = 3; SE +/- 0.0000, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better): RTX 3090: 27.1252; NVIDIA RTX 3090: 27.1252; NVIDIA 3090: 27.0904; RTX 4090: 3.3579. SE +/- 0.0002, N = 3; SE +/- 0.0000, N = 3.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better): RTX 3090: 2246.17; NVIDIA RTX 3090: 2240.09; NVIDIA 3090: 2245.23; RTX 4090: 2939.59. SE +/- 3.07, N = 3; SE +/- 0.34, N = 3.
All SHOC results: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Unvanquished Unvanquished is a modern fork of the Tremulous first person shooter. Unvanquished is powered by the Daemon engine, a combination of the ioquake3 engine with the graphically-beautiful XreaL engine. Unvanquished supports a modern OpenGL 3 renderer and other advanced graphics features for this open-source game. Learn more via the OpenBenchmarking.org test page.
Unvanquished 0.52.1 - Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, More Is Better): RTX 3090: 482.0; NVIDIA RTX 3090: 479.0; NVIDIA 3090: 440.0; RTX 4090: 620.3. SE +/- 2.19, N = 3; SE +/- 1.32, N = 3.
Unvanquished 0.52.1 - Resolution: 1920 x 1200 - Effects Quality: High (Frames Per Second, More Is Better): RTX 3090: 464.9; NVIDIA RTX 3090: 478.3; NVIDIA 3090: 474.6; RTX 4090: 628.5. SE +/- 4.62, N = 3; SE +/- 3.34, N = 3.
Unvanquished 0.52.1 - Resolution: 2560 x 1440 - Effects Quality: High (Frames Per Second, More Is Better): RTX 3090: 469.0; NVIDIA RTX 3090: 467.6; NVIDIA 3090: 475.7; RTX 4090: 618.8. SE +/- 2.97, N = 3; SE +/- 1.30, N = 3.
Unvanquished 0.52.1 - Resolution: 3840 x 2160 - Effects Quality: High (Frames Per Second, More Is Better): RTX 3090: 463.6; NVIDIA RTX 3090: 471.0; NVIDIA 3090: 472.9; RTX 4090: 607.3. SE +/- 3.53, N = 3; SE +/- 1.30, N = 3.
Unvanquished 0.52.1 - Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, More Is Better): RTX 3090: 470.8; NVIDIA RTX 3090: 469.5; NVIDIA 3090: 474.5; RTX 4090: 611.3. SE +/- 1.10, N = 3; SE +/- 0.91, N = 3.
Unvanquished 0.52.1 - Resolution: 1920 x 1200 - Effects Quality: Ultra (Frames Per Second, More Is Better): RTX 3090: 463.0; NVIDIA RTX 3090: 469.8; NVIDIA 3090: 471.1; RTX 4090: 615.4. SE +/- 5.55, N = 3; SE +/- 0.79, N = 3.
Unvanquished 0.52.1 - Resolution: 2560 x 1440 - Effects Quality: Ultra (Frames Per Second, More Is Better): RTX 3090: 461.7; NVIDIA RTX 3090: 452.5; NVIDIA 3090: 466.6; RTX 4090: 608.2. SE +/- 14.16, N = 12; SE +/- 0.07, N = 3.
Unvanquished 0.52.1 - Resolution: 3840 x 2160 - Effects Quality: Ultra (Frames Per Second, More Is Better): RTX 3090: 456.8; NVIDIA RTX 3090: 461.2; NVIDIA 3090: 469.7; RTX 4090: 597.3. SE +/- 4.41, N = 3; SE +/- 1.07, N = 3.
Unvanquished 0.52.1 - Resolution: 1920 x 1080 - Effects Quality: Medium (Frames Per Second, More Is Better): RTX 3090: 496.1; NVIDIA RTX 3090: 487.3; NVIDIA 3090: 491.9; RTX 4090: 640.1. SE +/- 3.91, N = 3; SE +/- 5.77, N = 3.
Unvanquished 0.52.1 - Resolution: 1920 x 1200 - Effects Quality: Medium (Frames Per Second, More Is Better): RTX 3090: 473.6; NVIDIA RTX 3090: 488.5; NVIDIA 3090: 484.0; RTX 4090: 638.9. SE +/- 1.98, N = 3; SE +/- 2.89, N = 3.
Unvanquished 0.52.1 - Resolution: 2560 x 1440 - Effects Quality: Medium (Frames Per Second, More Is Better): RTX 3090: 492.5; NVIDIA RTX 3090: 488.5; NVIDIA 3090: 481.2; RTX 4090: 637.5. SE +/- 3.75, N = 3; SE +/- 0.49, N = 3.
Unvanquished 0.52.1 - Resolution: 3840 x 2160 - Effects Quality: Medium (Frames Per Second, More Is Better): RTX 3090: 490.1; NVIDIA RTX 3090: 485.6; NVIDIA 3090: 483.0; RTX 4090: 636.9. SE +/- 2.17, N = 3; SE +/- 6.63, N = 3.
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
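The ViennaCL results below report BLAS level-1 operations such as AXPY (y = a*x + y) in GB/s. The following is a minimal pure-Python sketch of what AXPY computes and of one common way to turn a timing into an effective bandwidth. Counting three element transfers per index (read x, read y, write y) is an assumption made here; ViennaCL's built-in benchmark may count bytes differently. The `axpy` and `effective_bandwidth_gbs` names are hypothetical.

```python
def axpy(a, x, y):
    """In-place AXPY: y[i] <- a * x[i] + y[i] for every index i."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]


def effective_bandwidth_gbs(n_elements, bytes_per_element, seconds):
    """Effective memory bandwidth in GB/s for one AXPY call.

    Assumes three element transfers per index: read x, read y, write y.
    """
    bytes_moved = 3 * n_elements * bytes_per_element
    return bytes_moved / seconds / 1e9


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
axpy(2.0, x, y)
print(y)  # each y[i] is now 2.0 * x[i] plus its previous value

# One million single-precision (4-byte) elements processed in one millisecond.
print(effective_bandwidth_gbs(1_000_000, 4, 0.001))
```

The s and d prefixes in the test names (sAXPY, dAXPY) follow the usual BLAS convention for single and double precision, which corresponds to 4 or 8 bytes per element in the formula above.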
ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better): RTX 3090: 66.1; NVIDIA RTX 3090: 65.5; NVIDIA 3090: 65.3; RTX 4090: 206.0. SE +/- 0.09, N = 3; SE +/- 0.88, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better): RTX 3090: 99.6; NVIDIA RTX 3090: 98.2; NVIDIA 3090: 98.3; RTX 4090: 311.0. SE +/- 0.68, N = 3; SE +/- 1.45, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better): RTX 3090: 140; NVIDIA RTX 3090: 137; NVIDIA 3090: 137; RTX 4090: 332. SE +/- 0.67, N = 3; SE +/- 2.67, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, More Is Better): RTX 3090: 23.3; NVIDIA RTX 3090: 23.2; NVIDIA 3090: 23.2; RTX 4090: 58.0. SE +/- 0.00, N = 3; SE +/- 0.06, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, More Is Better): RTX 3090: 34.9; NVIDIA RTX 3090: 34.7; NVIDIA 3090: 34.3; RTX 4090: 87.6. SE +/- 0.00, N = 3; SE +/- 0.06, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, More Is Better): RTX 3090: 42.4; NVIDIA RTX 3090: 41.6; NVIDIA 3090: 42.2; RTX 4090: 96.9. SE +/- 0.42, N = 3; SE +/- 0.12, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better): RTX 3090: 68.1; NVIDIA RTX 3090: 67.1; NVIDIA 3090: 67.5; RTX 4090: 93.8. SE +/- 0.17, N = 3; SE +/- 9.12, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, More Is Better): RTX 3090: 76.1; NVIDIA RTX 3090: 75.5; NVIDIA 3090: 75.4; RTX 4090: 133.0. SE +/- 0.36, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better): RTX 3090: 81.5; NVIDIA RTX 3090: 81.9; NVIDIA 3090: 82.6; RTX 4090: 111.0. SE +/- 0.18, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better): RTX 3090: 88.4; NVIDIA RTX 3090: 84.4; NVIDIA 3090: 86.8; RTX 4090: 107.0. SE +/- 0.12, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better): RTX 3090: 93.8; NVIDIA RTX 3090: 92.5; NVIDIA 3090: 92.8; RTX 4090: 118.0. SE +/- 0.32, N = 3; SE +/- 0.67, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better): RTX 3090: 91.9; NVIDIA RTX 3090: 90.7; NVIDIA 3090: 90.8; RTX 4090: 114.0. SE +/- 0.26, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - sCOPY (GB/s, More Is Better): RTX 3090: 368; NVIDIA RTX 3090: 364; NVIDIA 3090: 364; RTX 4090: 452. SE +/- 0.88, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - sAXPY (GB/s, More Is Better): RTX 3090: 503; NVIDIA RTX 3090: 501; NVIDIA 3090: 500; RTX 4090: 575. SE +/- 0.33, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - sDOT (GB/s, More Is Better): RTX 3090: 375; NVIDIA RTX 3090: 371; NVIDIA 3090: 371; RTX 4090: 447. SE +/- 0.58, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dCOPY (GB/s, More Is Better): RTX 3090: 606; NVIDIA RTX 3090: 605; NVIDIA 3090: 608; RTX 4090: 659. SE +/- 0.33, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dAXPY (GB/s, More Is Better): RTX 3090: 722; NVIDIA RTX 3090: 722; NVIDIA 3090: 723; RTX 4090: 771. SE +/- 0.00, N = 3; SE +/- 0.58, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dDOT (GB/s, More Is Better): RTX 3090: 660; NVIDIA RTX 3090: 657; NVIDIA 3090: 637; RTX 4090: 720. SE +/- 0.58, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-N (GB/s, More Is Better): RTX 3090: 239; NVIDIA RTX 3090: 240; NVIDIA 3090: 237; RTX 4090: 221. SE +/- 0.33, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better): RTX 3090: 377; NVIDIA RTX 3090: 377; NVIDIA 3090: 377; RTX 4090: 441. SE +/- 0.33, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better): RTX 3090: 601; NVIDIA RTX 3090: 600; NVIDIA 3090: 604; RTX 4090: 1160.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better): RTX 3090: 604; NVIDIA RTX 3090: 605; NVIDIA 3090: 606; RTX 4090: 1283. SE +/- 1.67, N = 3; SE +/- 3.33, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better): RTX 3090: 599; NVIDIA RTX 3090: 601; NVIDIA 3090: 602; RTX 4090: 1297. SE +/- 1.33, N = 3; SE +/- 3.33, N = 3.
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, More Is Better): RTX 3090: 602; NVIDIA RTX 3090: 601; NVIDIA 3090: 605; RTX 4090: 1350.
All ViennaCL results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
VkFFT VkFFT is a Fast Fourier Transform (FFT) library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.
VkFFT 1.1.1 (Benchmark Score, More Is Better): RTX 3090: 45831; NVIDIA RTX 3090: 44252; NVIDIA 3090: 44147; RTX 4090: 76599. SE +/- 384.18, N = 9; SE +/- 489.60, N = 3. (CXX) g++ options: -O3
vkpeak Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.
vkpeak 20210424 - fp32-scalar (GFLOPS, More Is Better): RTX 3090: 20927.99; NVIDIA RTX 3090: 20826.66; NVIDIA 3090: 20960.25; RTX 4090: 44774.48. SE +/- 65.72, N = 3; SE +/- 61.65, N = 3.
vkpeak 20210424 - fp32-vec4 (GFLOPS, More Is Better): RTX 3090: 27806.74; NVIDIA RTX 3090: 27457.90; NVIDIA 3090: 27806.55; RTX 4090: 59114.75. SE +/- 69.31, N = 3; SE +/- 88.56, N = 3.
vkpeak 20210424 - fp16-scalar (GFLOPS, More Is Better): RTX 3090: 20949.64; NVIDIA RTX 3090: 20717.31; NVIDIA 3090: 20953.72; RTX 4090: 44666.11. SE +/- 45.34, N = 3; SE +/- 58.62, N = 3.
vkpeak 20210424 - fp16-vec4 (GFLOPS, More Is Better): RTX 3090: 41184.89; NVIDIA RTX 3090: 40926.17; NVIDIA 3090: 41496.68; RTX 4090: 88463.01. SE +/- 52.39, N = 3; SE +/- 3.14, N = 3.
vkpeak 20210424 - fp64-scalar (GFLOPS, More Is Better): RTX 3090: 653.61; NVIDIA RTX 3090: 649.53; NVIDIA 3090: 658.50; RTX 4090: 1409.19. SE +/- 0.83, N = 3; SE +/- 0.03, N = 3.
vkpeak 20210424 - int32-scalar (GIOPS, More Is Better): RTX 3090: 20769.27; NVIDIA RTX 3090: 20689.72; NVIDIA 3090: 20924.98; RTX 4090: 44768.16. SE +/- 44.97, N = 3; SE +/- 1.91, N = 3.
vkpeak 20210424 - int32-vec4 (GIOPS, More Is Better): RTX 3090: 20672.19; NVIDIA RTX 3090: 20594.78; NVIDIA 3090: 20672.21; RTX 4090: 44563.49. SE +/- 44.80, N = 3; SE +/- 2.83, N = 3.
vkpeak 20210424 - int16-scalar (GIOPS, More Is Better): RTX 3090: 13708.58; NVIDIA RTX 3090: 13657.36; NVIDIA 3090: 13709.41; RTX 4090: 29775.38. SE +/- 29.87, N = 3; SE +/- 36.89, N = 3.
vkpeak 20210424 - int16-vec4 (GIOPS, More Is Better): RTX 3090: 16885.79; NVIDIA RTX 3090: 16880.01; NVIDIA 3090: 17012.23; RTX 4090: 39651.75. SE +/- 2.98, N = 3; SE +/- 18.88, N = 3.
VkResample VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.
VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better): RTX 3090: 117.82; NVIDIA RTX 3090: 118.52; NVIDIA 3090: 119.07; RTX 4090: 55.37. SE +/- 0.08, N = 3; SE +/- 0.01, N = 3. (CXX) g++ options: -O3
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better): RTX 3090: 9.272; NVIDIA RTX 3090: 9.292; NVIDIA 3090: 9.291; RTX 4090: 7.760. SE +/- 0.003, N = 3; SE +/- 0.002, N = 3. (CXX) g++ options: -O3
Waifu2x-NCNN Vulkan Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.
Scale: 2x - Denoise: 3 - TAA: No
RTX 3090: Test failed to run.
NVIDIA RTX 3090: Test failed to run.
NVIDIA 3090: Test failed to run.
RTX 4090: The test run did not produce a result (reported three times).
Warsow 2.5 Beta - Resolution: 1920 x 1200 (Frames Per Second, More Is Better): RTX 3090: 984.8; NVIDIA RTX 3090: 980.3; NVIDIA 3090: 984.9; RTX 4090: 973.0. SE +/- 5.60, N = 3; SE +/- 4.70, N = 3.
Warsow 2.5 Beta - Resolution: 2560 x 1440 (Frames Per Second, More Is Better): RTX 3090: 984.7; NVIDIA RTX 3090: 985.7; NVIDIA 3090: 985.5; RTX 4090: 980.1. SE +/- 0.35, N = 3; SE +/- 5.45, N = 3.
Warsow 2.5 Beta - Resolution: 3840 x 2160 (Frames Per Second, More Is Better): RTX 3090: 960.8; NVIDIA RTX 3090: 980.7; NVIDIA 3090: 978.6; RTX 4090: 985.4. SE +/- 0.78, N = 3; SE +/- 0.07, N = 3.
Xonotic This is a benchmark of Xonotic, which is a fork of the DarkPlaces-based Nexuiz game. Development began in March of 2010 on the Xonotic game. Learn more via the OpenBenchmarking.org test page.
Xonotic 0.8.2 - Resolution: 3840 x 2160 - Effects Quality: Low (Frames Per Second, More Is Better): RTX 3090: 655.14 (MIN: 103 / MAX: 1300); NVIDIA RTX 3090: 659.91 (MIN: 109 / MAX: 1303); NVIDIA 3090: 645.85 (MIN: 117 / MAX: 1283); RTX 4090: 830.02 (MIN: 217 / MAX: 1735). SE +/- 0.55, N = 3; SE +/- 2.14, N = 3.
Xonotic 0.8.2 - Resolution: 3840 x 2160 - Effects Quality: High (Frames Per Second, More Is Better): RTX 3090: 556.93 (MIN: 114 / MAX: 1122); NVIDIA RTX 3090: 567.72 (MIN: 91 / MAX: 1163); NVIDIA 3090: 566.12 (MIN: 113 / MAX: 1151); RTX 4090: 701.86 (MIN: 193 / MAX: 1442). SE +/- 1.97, N = 3; SE +/- 3.30, N = 3.
Xonotic 0.8.2 - Resolution: 3840 x 2160 - Effects Quality: Ultra (Frames Per Second, More Is Better): RTX 3090: 499.26 (MIN: 117 / MAX: 913); NVIDIA RTX 3090: 492.47 (MIN: 93 / MAX: 902); NVIDIA 3090: 490.28 (MIN: 122 / MAX: 891); RTX 4090: 586.68 (MIN: 218 / MAX: 1144). SE +/- 1.15, N = 3; SE +/- 0.80, N = 3.
Xonotic 0.8.2 - Resolution: 3840 x 2160 - Effects Quality: Ultimate (Frames Per Second, More Is Better): RTX 3090: 374.43 (MIN: 65 / MAX: 751); NVIDIA RTX 3090: 369.92 (MIN: 58 / MAX: 764); NVIDIA 3090: 367.11 (MIN: 65 / MAX: 749); RTX 4090: 457.50 (MIN: 72 / MAX: 1128). SE +/- 1.53, N = 3; SE +/- 0.84, N = 3.
RTX 3090
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201016
Graphics Notes: BAR1 / Visible vRAM Size: 32768 MiB
OpenCL Notes: GPU Compute Cores: 10496
Python Notes: Python 3.9.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 5 December 2021 18:19 by user pts.
NVIDIA RTX 3090 Kernel, Compiler, Processor, Graphics, OpenCL, Python, and Security Notes: identical to the RTX 3090 notes above.
Testing initiated at 5 December 2021 20:14 by user pts.
NVIDIA 3090 Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3801 BIOS), Chipset: AMD Starship/Matisse, Memory: 32GB, Disk: 1000GB Sabrent Rocket 4.0 Plus, Graphics: NVIDIA GeForce RTX 3090 24GB, Audio: NVIDIA GA102 HD Audio, Monitor: ASUS MG28U, Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 21.10, Kernel: 5.13.0-22-generic (x86_64), Desktop: GNOME Shell 40.5, Display Server: X Server 1.20.13, Display Driver: NVIDIA 495.44, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 11.5.100, Vulkan: 1.2.186, Compiler: GCC 11.2.0 + Clang 13.0.0-2, File-System: ext4, Screen Resolution: 3840x2160
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201016
Graphics Notes: BAR1 / Visible vRAM Size: 32768 MiB
OpenCL Notes: GPU Compute Cores: 10496
Python Notes: Python 3.9.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 6 December 2021 04:10 by user pts.
RTX 4090 Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads), Motherboard: ASUS TUF GAMING X670E-PLUS WIFI (0613 BIOS), Chipset: AMD Device 14d8, Memory: 32GB, Disk: 2048GB XPG GAMMIX S70 BLADE + 4001GB SSD 870 QVO 4TB, Graphics: Gigabyte NVIDIA GeForce RTX 4090 24GB, Audio: NVIDIA Device 22ba, Monitor: PI-KVM Video, Network: Realtek RTL8125 2.5GbE + MEDIATEK Device 0608
OS: Ubuntu 22.04, Kernel: 5.15.0-25-generic (x86_64), Desktop: GNOME Shell 42.4, Display Server: X Server 1.21.1.3, Display Driver: NVIDIA 520.56.06, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 11.8.87, Vulkan: 1.3.205, Compiler: GCC 13.0.0 20221013 + Clang 14.0.0-1ubuntu1, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-multilib --enable-checking=release --enable-languages=c,c++ --host=x86_64-linux-gnu --target=x86_64-linux-gnu -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
Graphics Notes: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.18.00.d2
OpenCL Notes: GPU Compute Cores: 16384
Python Notes: Python 3.10.6
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 13 October 2022 05:25 by user test.