mighty-3090x2-2

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 DESIGNARE (F5 BIOS) and NVIDIA GeForce RTX 3090 24GB on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2303302-NE-MIGHTY30974&gru.

mighty-3090x2-2ProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDisplay ServerDisplay DriverOpenCLVulkanCompilerFile-SystemScreen ResolutionRTX 3090 x 2AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)Gigabyte TRX40 DESIGNARE (F5 BIOS)AMD Starship/Matisse256GB2048GB ADATA SX8200PNP + 3 x 2048GB SPCC M.2 PCIe SSD + 5 x 14001GB Western Digital WUH721414ALNVIDIA GeForce RTX 3090 24GBNVIDIA Device 1aef2 x Intel I210 + Intel Wi-Fi 6 AX200Ubuntu 20.045.15.0-67-generic (x86_64)X Server 1.20.11NVIDIAOpenCL 3.0 CUDA 11.6.1341.3.194GCC 9.4.0 + CUDA 11.6btrfs1024x768OpenBenchmarking.org- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301039- BAR1 / Visible vRAM Size: 256 MiB- Python 3.8.10- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

mighty-3090x2-2neatbench: GPUcl-mem: Copycl-mem: Readcl-mem: Writeviennacl: CPU BLAS - sCOPYviennacl: CPU BLAS - sAXPYviennacl: CPU BLAS - sDOTviennacl: CPU BLAS - dCOPYviennacl: CPU BLAS - dAXPYviennacl: CPU BLAS - dDOTviennacl: CPU BLAS - dGEMV-Nviennacl: CPU BLAS - dGEMV-Tviennacl: OpenCL BLAS - sCOPYviennacl: OpenCL BLAS - sAXPYviennacl: OpenCL BLAS - sDOTviennacl: OpenCL BLAS - dCOPYviennacl: OpenCL BLAS - dAXPYviennacl: OpenCL BLAS - dDOTviennacl: OpenCL BLAS - dGEMV-Nviennacl: OpenCL BLAS - dGEMV-Tclpeak: Global Memory Bandwidthmixbench: OpenCL - Double Precisionmixbench: OpenCL - Single Precisionmixbench: NVIDIA CUDA - Half Precisionmixbench: NVIDIA CUDA - Double Precisionmixbench: NVIDIA CUDA - Single Precisionclpeak: Single-Precision Floatclpeak: Double-Precision Doubleviennacl: CPU BLAS - dGEMM-NNviennacl: CPU BLAS - dGEMM-TNviennacl: CPU BLAS - dGEMM-TTviennacl: CPU BLAS - dGEMM-NTviennacl: OpenCL BLAS - dGEMM-NNviennacl: OpenCL BLAS - dGEMM-NTviennacl: OpenCL BLAS - dGEMM-TNviennacl: OpenCL BLAS - dGEMM-TTmixbench: OpenCL - Integermixbench: NVIDIA CUDA - Integerclpeak: Integer Compute INThashcat: MD5hashcat: SHA1hashcat: 7-Ziphashcat: SHA-512hashcat: TrueCrypt RIPEMD160 + XTSluxcorerender: DLSC - GPUluxcorerender: Danish Mood - GPUluxcorerender: Orange Juice - GPUluxcorerender: LuxCore Benchmark - GPUluxcorerender: Rainbow Colors and Prism - GPUlczero: OpenCLfahbench: caffe: AlexNet - NVIDIA CUDA - 100caffe: AlexNet - NVIDIA CUDA - 200caffe: AlexNet - NVIDIA CUDA - 1000caffe: GoogleNet - NVIDIA CUDA - 100caffe: GoogleNet - NVIDIA CUDA - 200caffe: GoogleNet - NVIDIA CUDA - 1000arrayfire: Conjugate Gradient OpenCLfinancebench: Black-Scholes OpenCLrodinia: OpenCL Particle FilterRTX 3090 x 23090360.3795.6749.382576461717531447113042.2717361493365599714648235370816.35512.4336828.0736773.91493.2334440.0834570.34642.6110611111010658859058558717461.9016566.7617621.661006861687504228766666721454006099166667158476727.8219.4122.6121.4861.6519173334.5374699.5691381.716874.602800.175566.1227981.11.5016.3263.898OpenBenchmarking.org

NeatBench

Acceleration: GPU

OpenBenchmarking.orgFPS, More Is BetterNeatBench 5Acceleration: GPURTX 3090 x 270014002100280035003090

cl-mem

Benchmark: Copy

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: CopyRTX 3090 x 280160240320400SE +/- 0.07, N = 3360.31. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: ReadRTX 3090 x 22004006008001000SE +/- 0.99, N = 3795.61. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: WriteRTX 3090 x 2160320480640800SE +/- 0.03, N = 3749.31. (CC) gcc options: -O2 -flto -lOpenCL

ViennaCL

Test: CPU BLAS - sCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sCOPYRTX 3090 x 22004006008001000SE +/- 26.26, N = 158251. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sAXPYRTX 3090 x 2160320480640800SE +/- 30.72, N = 157641. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sDOTRTX 3090 x 2130260390520650SE +/- 24.85, N = 156171. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dCOPYRTX 3090 x 2400800120016002000SE +/- 59.20, N = 1517531. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dAXPYRTX 3090 x 230060090012001500SE +/- 50.51, N = 1514471. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dDOTRTX 3090 x 22004006008001000SE +/- 58.79, N = 1511301. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMV-NRTX 3090 x 21020304050SE +/- 2.09, N = 1542.21. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMV-TRTX 3090 x 2150300450600750SE +/- 28.09, N = 157171. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sCOPYRTX 3090 x 280160240320400SE +/- 0.88, N = 33611. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sAXPYRTX 3090 x 2110220330440550SE +/- 0.67, N = 34931. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sDOTRTX 3090 x 280160240320400SE +/- 0.88, N = 33651. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dCOPYRTX 3090 x 2130260390520650SE +/- 1.00, N = 35991. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dAXPYRTX 3090 x 21503004506007507141. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dDOTRTX 3090 x 2140280420560700SE +/- 0.88, N = 36481. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMV-NRTX 3090 x 250100150200250SE +/- 0.33, N = 32351. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMV-TRTX 3090 x 280160240320400SE +/- 0.33, N = 33701. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

OpenBenchmarking.orgGBPS, More Is Betterclpeak 1.1.2OpenCL Test: Global Memory BandwidthRTX 3090 x 22004006008001000SE +/- 0.02, N = 3816.351. (CXX) g++ options: -O3

Mixbench

Backend: OpenCL - Benchmark: Double Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Double PrecisionRTX 3090 x 2110220330440550SE +/- 9.27, N = 15512.431. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Single PrecisionRTX 3090 x 28K16K24K32K40KSE +/- 625.58, N = 1536828.071. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Half Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Half PrecisionRTX 3090 x 28K16K24K32K40KSE +/- 670.64, N = 1536773.911. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Double Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Double PrecisionRTX 3090 x 2110220330440550SE +/- 8.10, N = 15493.231. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Single Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Single PrecisionRTX 3090 x 27K14K21K28K35KSE +/- 586.71, N = 1534440.081. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

clpeak

OpenCL Test: Single-Precision Float

OpenBenchmarking.orgGFLOPS, More Is Betterclpeak 1.1.2OpenCL Test: Single-Precision FloatRTX 3090 x 27K14K21K28K35KSE +/- 119.42, N = 334570.341. (CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

OpenBenchmarking.orgGFLOPS, More Is Betterclpeak 1.1.2OpenCL Test: Double-Precision DoubleRTX 3090 x 2140280420560700SE +/- 0.78, N = 3642.611. (CXX) g++ options: -O3

ViennaCL

Test: CPU BLAS - dGEMM-NN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-NNRTX 3090 x 220406080100SE +/- 0.53, N = 151061. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-TNRTX 3090 x 220406080100SE +/- 0.54, N = 141111. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-TTRTX 3090 x 220406080100SE +/- 0.48, N = 121101. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-NTRTX 3090 x 2204060801001061. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-NNRTX 3090 x 2130260390520650SE +/- 1.53, N = 35881. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-NTRTX 3090 x 2130260390520650SE +/- 1.53, N = 35901. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-TNRTX 3090 x 2130260390520650SE +/- 1.33, N = 35851. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-TTRTX 3090 x 21302603905206505871. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

Mixbench

Backend: OpenCL - Benchmark: Integer

OpenBenchmarking.orgGIOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: IntegerRTX 3090 x 24K8K12K16K20KSE +/- 6.05, N = 317461.901. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Integer

OpenBenchmarking.orgGIOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: IntegerRTX 3090 x 24K8K12K16K20KSE +/- 323.63, N = 1516566.761. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

clpeak

OpenCL Test: Integer Compute INT

OpenBenchmarking.orgGIOPS, More Is Betterclpeak 1.1.2OpenCL Test: Integer Compute INTRTX 3090 x 24K8K12K16K20KSE +/- 0.16, N = 317621.661. (CXX) g++ options: -O3

Hashcat

Benchmark: MD5

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: MD5RTX 3090 x 220000M40000M60000M80000M100000MSE +/- 8558826478.48, N = 16100686168750

Hashcat

Benchmark: SHA1

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: SHA1RTX 3090 x 29000M18000M27000M36000M45000MSE +/- 53718567.04, N = 342287666667

Hashcat

Benchmark: 7-Zip

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: 7-ZipRTX 3090 x 2500K1000K1500K2000K2500KSE +/- 1700.98, N = 32145400

Hashcat

Benchmark: SHA-512

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: SHA-512RTX 3090 x 21300M2600M3900M5200M6500MSE +/- 13877599.86, N = 36099166667

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: TrueCrypt RIPEMD160 + XTSRTX 3090 x 2300K600K900K1200K1500KSE +/- 3668.48, N = 31584767

LuxCoreRender

Scene: DLSC - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: DLSC - Acceleration: GPURTX 3090 x 2714212835SE +/- 0.16, N = 327.82MIN: 25.71 / MAX: 28.34

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: Danish Mood - Acceleration: GPURTX 3090 x 2510152025SE +/- 0.06, N = 319.41MIN: 8.01 / MAX: 22.62

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: Orange Juice - Acceleration: GPURTX 3090 x 2510152025SE +/- 0.23, N = 522.61MIN: 19.29 / MAX: 32

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: LuxCore Benchmark - Acceleration: GPURTX 3090 x 2510152025SE +/- 0.02, N = 321.48MIN: 8.67 / MAX: 26.83

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: Rainbow Colors and Prism - Acceleration: GPURTX 3090 x 21428425670SE +/- 0.30, N = 361.65MIN: 53.45 / MAX: 71.5

LeelaChessZero

Backend: OpenCL

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.28Backend: OpenCLRTX 3090 x 24K8K12K16K20KSE +/- 197.85, N = 4191731. (CXX) g++ options: -flto -pthread

FAHBench

OpenBenchmarking.orgNs Per Day, More Is BetterFAHBench 2.3.2RTX 3090 x 270140210280350SE +/- 0.26, N = 3334.54

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe 2020-02-13Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100RTX 3090 x 2150300450600750SE +/- 0.44, N = 3699.571. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe 2020-02-13Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200RTX 3090 x 230060090012001500SE +/- 3.10, N = 31381.711. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe 2020-02-13Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000RTX 3090 x 215003000450060007500SE +/- 8.67, N = 36874.601. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe 2020-02-13Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100RTX 3090 x 26001200180024003000SE +/- 14.32, N = 32800.171. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe 2020-02-13Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200RTX 3090 x 212002400360048006000SE +/- 2.52, N = 35566.121. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe 2020-02-13Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000RTX 3090 x 26K12K18K24K30KSE +/- 96.41, N = 327981.11. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

ArrayFire

Test: Conjugate Gradient OpenCL

OpenBenchmarking.orgms, Fewer Is BetterArrayFire 3.7Test: Conjugate Gradient OpenCLRTX 3090 x 20.33770.67541.01311.35081.6885SE +/- 0.004, N = 31.5011. (CXX) g++ options: -rdynamic

FinanceBench

Benchmark: Black-Scholes OpenCL

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-07-25Benchmark: Black-Scholes OpenCLRTX 3090 x 2246810SE +/- 0.002, N = 36.3261. (CXX) g++ options: -O3 -march=native -fopenmp

Rodinia

Test: OpenCL Particle Filter

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL Particle FilterRTX 3090 x 20.87711.75422.63133.50844.3855SE +/- 0.030, N = 103.8981. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl


Phoronix Test Suite v10.8.5