newnew

Intel Core i7-1165G7 testing with a Dell 0GG9PT (3.15.0 BIOS) and Intel Xe TGL GT2 15GB on Ubuntu 23.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2308049-NE-NEWNEW95665&grt&sor.

newnew (runs: a, b, c)

Processor: Intel Core i7-1165G7 @ 4.70GHz (4 Cores / 8 Threads)
Motherboard: Dell 0GG9PT (3.15.0 BIOS)
Chipset: Intel Tiger Lake-LP
Memory: 16GB
Disk: Kioxia KBG40ZNS256G NVMe 256GB
Graphics: Intel Xe TGL GT2 15GB (1300MHz)
Audio: Realtek ALC289
Network: Intel Wi-Fi 6 AX201
OS: Ubuntu 23.04
Kernel: 6.2.0-24-generic (x86_64)
Desktop: GNOME Shell 44.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.0.2
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0xa6
- Thermald 2.5.2

Java Details
- OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu123.04)

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected


Apache Cassandra

Test: Writes

Op/s, More Is Better (Apache Cassandra 4.1.3)
b: 40823 (SE +/- 1520.50, N = 2)
a: 39864 (SE +/- 2472.50, N = 2)
c: 39536 (SE +/- 2294.50, N = 2)
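As a rough reading aid for the standard errors reported above, the following sketch checks whether the mean +/- SE ranges of the three Cassandra runs intersect. The `Run` type and the `intervals_overlap` helper are illustrative names, not part of the Phoronix Test Suite, and range overlap is only a coarse noise check under the assumption that SE describes run-to-run spread; it is not a formal significance test.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Run:
    """One result: mean and reported standard error (hypothetical helper type)."""
    name: str
    mean: float
    se: float

    @property
    def low(self) -> float:
        return self.mean - self.se

    @property
    def high(self) -> float:
        return self.mean + self.se


def intervals_overlap(x: Run, y: Run) -> bool:
    """True when the mean +/- SE ranges of the two runs intersect."""
    return x.low <= y.high and y.low <= x.high


# Apache Cassandra 4.1.3, Test: Writes (Op/s), values copied from the result above.
cassandra = [
    Run("a", 39864, 2472.50),
    Run("b", 40823, 1520.50),
    Run("c", 39536, 2294.50),
]

for x, y in combinations(cassandra, 2):
    verdict = "overlap" if intervals_overlap(x, y) else "are separated"
    print(f"{x.name} vs {y.name}: mean +/- SE ranges {verdict}")
```

With only two samples per run (N = 2), a check like this is best treated as a hint about which differences deserve a re-run rather than as a conclusion.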

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200

point/sec, More Is Better (Apache IoTDB 1.1.2)
c: 655887.03
b: 644833.00
a: 640563.22

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200

Average Latency, More Is Better (Apache IoTDB 1.1.2)
a: 16.78 (MAX: 906.87)
b: 16.41 (MAX: 1017.33)
c: 16.03 (MAX: 1027.73)

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500

point/sec, More Is Better (Apache IoTDB 1.1.2)
c: 1266283.94
a: 1265418.91
b: 1247796.00

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500

Average Latency, More Is Better (Apache IoTDB 1.1.2)
b: 25.14 (MAX: 1026.24)
a: 24.98 (MAX: 1064.56)
c: 24.73 (MAX: 998.23)

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200

point/sec, More Is Better (Apache IoTDB 1.1.2)
a: 1035973.61
b: 1007122.62
c: 995398.43

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200

Average Latency, More Is Better (Apache IoTDB 1.1.2)
c: 12.78 (MAX: 800.68)
b: 12.52 (MAX: 794.23)
a: 12.00 (MAX: 771.24)

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500

point/sec, More Is Better (Apache IoTDB 1.1.2)
a: 1696504.02
c: 1659495.29
b: 1651562.36

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500

Average Latency, More Is Better (Apache IoTDB 1.1.2)
b: 23.10 (MAX: 906.47)
c: 22.63 (MAX: 909.27)
a: 22.08 (MAX: 863.43)

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200

point/sec, More Is Better (Apache IoTDB 1.1.2)
a: 22588608.67
b: 21253793.53
c: 17612257.13

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200

Average Latency, More Is Better (Apache IoTDB 1.1.2)
c: 76.87 (MAX: 9168.34)
b: 74.08 (MAX: 1547.17)
a: 68.48 (MAX: 1696.56)

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500

point/sec, More Is Better (Apache IoTDB 1.1.2)
b: 21812047.21
a: 21059562.60
c: 8804114.19

Apache IoTDB

Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500

Average Latency, More Is Better (Apache IoTDB 1.1.2)
c: 499.98 (MAX: 7223.32)
a: 211.96 (MAX: 1593.95)
b: 200.97 (MAX: 1387.04)

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200

point/sec, More Is Better (Apache IoTDB 1.1.2)
b: 18637109.83
a: 17866468.26
c: 9365385.68

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200

Average Latency, More Is Better (Apache IoTDB 1.1.2)
c: 164.99 (MAX: 11718.16)
a: 91.67 (MAX: 7883.93)
b: 85.64 (MAX: 1651.27)

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500

point/sec, More Is Better (Apache IoTDB 1.1.2)
a: 16383690.96
b: 10851137.82
c: 5724897.60

Apache IoTDB

Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500

Average Latency, More Is Better (Apache IoTDB 1.1.2)
c: 757.34 (MAX: 21117.8)
b: 370.36 (MAX: 7948.19)
a: 284.41 (MAX: 2551.38)
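The throughput gap between runs is much wider in the Batch Size Per Write: 100 configurations than in the batch-size-1 ones. A minimal sketch for quantifying that, expressing each run as a percentage of the fastest run in the same test, is shown here; the `throughput` table and `relative_to_best` helper are hypothetical names introduced for illustration, with values copied from the point/sec results above.

```python
# Throughput in point/sec for the Batch Size Per Write: 100 configurations,
# keyed by (device count, sensor count); values copied from the results above.
throughput = {
    (100, 200): {"a": 22588608.67, "b": 21253793.53, "c": 17612257.13},
    (100, 500): {"a": 21059562.60, "b": 21812047.21, "c": 8804114.19},
    (200, 200): {"a": 17866468.26, "b": 18637109.83, "c": 9365385.68},
    (200, 500): {"a": 16383690.96, "b": 10851137.82, "c": 5724897.60},
}


def relative_to_best(runs: dict[str, float]) -> dict[str, float]:
    """Express each run as a percentage of the fastest run in the same test."""
    best = max(runs.values())
    return {name: 100.0 * value / best for name, value in runs.items()}


for (devices, sensors), runs in throughput.items():
    shares = relative_to_best(runs)
    line = ", ".join(f"{name}: {pct:5.1f}%" for name, pct in sorted(shares.items()))
    print(f"devices={devices} sensors={sensors} -> {line}")
```

These IoTDB results carry no standard error in the export, so the percentages describe single reported values and say nothing about how repeatable the gap is.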

BRL-CAD

VGR Performance Metric

VGR Performance Metric, More Is Better (BRL-CAD 7.36)
a: 52008 (SE +/- 16.50, N = 2)
b: 51967 (SE +/- 147.00, N = 2)
c: 51880 (SE +/- 34.00, N = 2)
1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

Dragonflydb

Clients Per Thread: 10 - Set To Get Ratio: 1:5

Ops/sec, More Is Better (Dragonflydb 1.6.2)
c: 1394989.58 (SE +/- 156954.29, N = 2)
a: 1331051.90 (SE +/- 124523.27, N = 2)
b: 1310891.48 (SE +/- 115490.54, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 20 - Set To Get Ratio: 1:5

Ops/sec, More Is Better (Dragonflydb 1.6.2)
a: 1620397.17 (SE +/- 138897.05, N = 2)
b: 1572424.89 (SE +/- 146499.45, N = 2)
c: 1561730.49 (SE +/- 157442.25, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 50 - Set To Get Ratio: 1:5

Ops/sec, More Is Better (Dragonflydb 1.6.2)
c: 1559859.23 (SE +/- 148775.44, N = 2)
b: 1546553.32 (SE +/- 81765.99, N = 2)
a: 1509161.50 (SE +/- 35029.57, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 10 - Set To Get Ratio: 1:10

Ops/sec, More Is Better (Dragonflydb 1.6.2)
c: 1325104.13 (SE +/- 143882.01, N = 2)
a: 1283252.40 (SE +/- 119195.59, N = 2)
b: 1275557.39 (SE +/- 90004.70, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 20 - Set To Get Ratio: 1:10

Ops/sec, More Is Better (Dragonflydb 1.6.2)
c: 1620928.16 (SE +/- 153108.99, N = 2)
a: 1553942.49 (SE +/- 93795.63, N = 2)
b: 1532531.22 (SE +/- 144264.17, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 50 - Set To Get Ratio: 1:10

Ops/sec, More Is Better (Dragonflydb 1.6.2)
c: 1574780.45 (SE +/- 128692.32, N = 2)
a: 1559408.51 (SE +/- 99951.62, N = 2)
b: 1545294.47 (SE +/- 111709.66, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 10 - Set To Get Ratio: 1:100

Ops/sec, More Is Better (Dragonflydb 1.6.2)
b: 1305698.48 (SE +/- 95110.31, N = 2)
c: 1297415.25 (SE +/- 141978.48, N = 2)
a: 1267626.25 (SE +/- 117619.34, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 20 - Set To Get Ratio: 1:100

Ops/sec, More Is Better (Dragonflydb 1.6.2)
c: 1655250.86 (SE +/- 168860.22, N = 2)
b: 1532061.63 (SE +/- 71249.46, N = 2)
a: 1504800.26 (SE +/- 66621.90, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Dragonflydb

Clients Per Thread: 50 - Set To Get Ratio: 1:100

Ops/sec, More Is Better (Dragonflydb 1.6.2)
a: 1590238.43 (SE +/- 113274.82, N = 2)
c: 1548401.54 (SE +/- 65233.84, N = 2)
b: 1538729.61 (SE +/- 98659.37, N = 2)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

NCNN

Target: CPU - Model: mobilenet

ms, Fewer Is Better (NCNN 20230517)
a: 20.74 (SE +/- 0.05, N = 2; MIN: 20.17 / MAX: 31.42)
b: 20.74 (SE +/- 0.04, N = 2; MIN: 20.23 / MAX: 31.9)
c: 20.74 (SE +/- 0.07, N = 2; MIN: 20.26 / MAX: 32.28)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

ms, Fewer Is Better (NCNN 20230517)
c: 4.53 (SE +/- 0.02, N = 2; MIN: 4.3 / MAX: 15.17)
a: 4.60 (SE +/- 0.04, N = 2; MIN: 4.38 / MAX: 12.37)
b: 4.60 (SE +/- 0.02, N = 2; MIN: 4.34 / MAX: 14.83)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

ms, Fewer Is Better (NCNN 20230517)
c: 3.51 (SE +/- 0.00, N = 2; MIN: 3.32 / MAX: 11.83)
a: 3.58 (SE +/- 0.03, N = 2; MIN: 3.35 / MAX: 13.78)
b: 3.59 (SE +/- 0.04, N = 2; MIN: 3.37 / MAX: 14.17)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

ms, Fewer Is Better (NCNN 20230517)
b: 3.47 (SE +/- 0.01, N = 2; MIN: 3.32 / MAX: 12.41)
c: 3.47 (SE +/- 0.02, N = 2; MIN: 3.36 / MAX: 13.07)
a: 4.03 (SE +/- 0.57, N = 2; MIN: 3.35 / MAX: 10.89)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

ms, Fewer Is Better (NCNN 20230517)
c: 3.83 (SE +/- 0.01, N = 2; MIN: 3.68 / MAX: 12.77)
b: 3.86 (SE +/- 0.07, N = 2; MIN: 3.69 / MAX: 12.48)
a: 4.55 (SE +/- 0.63, N = 2; MIN: 3.81 / MAX: 15.97)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

ms, Fewer Is Better (NCNN 20230517)
c: 6.64 (SE +/- 0.05, N = 2; MIN: 6.38 / MAX: 17.34)
b: 6.92 (SE +/- 0.04, N = 2; MIN: 6.5 / MAX: 16.04)
a: 8.85 (SE +/- 0.29, N = 2; MIN: 6.6 / MAX: 21.28)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

ms, Fewer Is Better (NCNN 20230517)
b: 0.94 (SE +/- 0.02, N = 2; MIN: 0.9 / MAX: 1.09)
c: 0.96 (SE +/- 0.03, N = 2; MIN: 0.9 / MAX: 5.52)
a: 1.32 (SE +/- 0.03, N = 2; MIN: 1.2 / MAX: 4.08)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

ms, Fewer Is Better (NCNN 20230517)
c: 12.42 (SE +/- 0.27, N = 2; MIN: 11.68 / MAX: 23.37)
b: 12.78 (SE +/- 0.20, N = 2; MIN: 11.9 / MAX: 22.46)
a: 15.82 (SE +/- 0.07, N = 2; MIN: 15.12 / MAX: 28.62)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

ms, Fewer Is Better (NCNN 20230517)
c: 54.13 (SE +/- 2.37, N = 2; MIN: 48.55 / MAX: 72.36)
b: 54.25 (SE +/- 1.96, N = 2; MIN: 50.32 / MAX: 71.7)
a: 59.09 (SE +/- 0.11, N = 2; MIN: 57.54 / MAX: 76.29)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

ms, Fewer Is Better (NCNN 20230517)
c: 9.19 (SE +/- 0.50, N = 2; MIN: 8.38 / MAX: 20.09)
b: 9.25 (SE +/- 0.11, N = 2; MIN: 8.51 / MAX: 24.74)
a: 11.17 (SE +/- 0.04, N = 2; MIN: 10.66 / MAX: 21.33)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

ms, Fewer Is Better (NCNN 20230517)
c: 7.30 (SE +/- 0.25, N = 2; MIN: 6.79 / MAX: 17.09)
b: 7.43 (SE +/- 0.09, N = 2; MIN: 6.9 / MAX: 16.46)
a: 8.61 (SE +/- 0.04, N = 2; MIN: 8.27 / MAX: 20.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

ms, Fewer Is Better (NCNN 20230517)
b: 26.57 (SE +/- 2.23, N = 2; MIN: 23.28 / MAX: 38.91)
c: 26.57 (SE +/- 2.20, N = 2; MIN: 23.37 / MAX: 38.7)
a: 28.74 (SE +/- 0.18, N = 2; MIN: 27.89 / MAX: 40)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

ms, Fewer Is Better (NCNN 20230517)
b: 29.03 (SE +/- 0.03, N = 2; MIN: 28.45 / MAX: 40.14)
c: 29.25 (SE +/- 0.03, N = 2; MIN: 28.42 / MAX: 40.16)
a: 29.44 (SE +/- 0.42, N = 2; MIN: 28.43 / MAX: 46.03)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

ms, Fewer Is Better (NCNN 20230517)
c: 12.05 (SE +/- 0.97, N = 2; MIN: 10.72 / MAX: 23.89)
b: 12.23 (SE +/- 0.86, N = 2; MIN: 10.77 / MAX: 31.43)
a: 12.95 (SE +/- 0.03, N = 2; MIN: 12.67 / MAX: 23.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

ms, Fewer Is Better (NCNN 20230517)
c: 8.39 (SE +/- 0.01, N = 2; MIN: 8.14 / MAX: 18.42)
b: 8.61 (SE +/- 0.03, N = 2; MIN: 8.21 / MAX: 17.64)
a: 11.76 (SE +/- 0.10, N = 2; MIN: 11.26 / MAX: 22.16)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

ms, Fewer Is Better (NCNN 20230517)
a: 189.40 (SE +/- 4.42, N = 2; MIN: 169.61 / MAX: 225.69)
c: 200.23 (SE +/- 5.48, N = 2; MIN: 169.16 / MAX: 234.64)
b: 204.36 (SE +/- 3.85, N = 2; MIN: 168.8 / MAX: 231.88)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

ms, Fewer Is Better (NCNN 20230517)
c: 3.89 (SE +/- 0.03, N = 2; MIN: 3.67 / MAX: 13.73)
b: 3.91 (SE +/- 0.00, N = 2; MIN: 3.71 / MAX: 13.7)
a: 3.97 (SE +/- 0.00, N = 2; MIN: 3.76 / MAX: 14.59)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

ms, Fewer Is Better (NCNN 20230517)
b: 20.65 (SE +/- 0.03, N = 2; MIN: 20.22 / MAX: 31.66)
c: 20.69 (SE +/- 0.03, N = 2; MIN: 20.24 / MAX: 31.31)
a: 21.07 (SE +/- 0.19, N = 2; MIN: 20.27 / MAX: 31.83)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

ms, Fewer Is Better (NCNN 20230517)
c: 4.60 (SE +/- 0.04, N = 2; MIN: 4.38 / MAX: 14)
b: 4.63 (SE +/- 0.03, N = 2; MIN: 4.44 / MAX: 14.09)
a: 4.66 (SE +/- 0.00, N = 2; MIN: 4.46 / MAX: 14.54)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

ms, Fewer Is Better (NCNN 20230517)
c: 3.54 (SE +/- 0.01, N = 2; MIN: 3.33 / MAX: 11.68)
b: 3.59 (SE +/- 0.02, N = 2; MIN: 3.4 / MAX: 12.21)
a: 3.60 (SE +/- 0.03, N = 2; MIN: 3.38 / MAX: 12.52)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

ms, Fewer Is Better (NCNN 20230517)
b: 3.49 (SE +/- 0.01, N = 2; MIN: 3.33 / MAX: 13.62)
c: 3.49 (SE +/- 0.02, N = 2; MIN: 3.35 / MAX: 11.83)
a: 3.50 (SE +/- 0.01, N = 2; MIN: 3.35 / MAX: 12.33)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

ms, Fewer Is Better (NCNN 20230517)
c: 3.87 (SE +/- 0.04, N = 2; MIN: 3.71 / MAX: 12.28)
b: 3.89 (SE +/- 0.03, N = 2; MIN: 3.74 / MAX: 12.97)
a: 3.93 (SE +/- 0.01, N = 2; MIN: 3.73 / MAX: 14.41)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

ms, Fewer Is Better (NCNN 20230517)
c: 6.81 (SE +/- 0.15, N = 2; MIN: 6.42 / MAX: 17.77)
b: 6.98 (SE +/- 0.02, N = 2; MIN: 6.57 / MAX: 17.79)
a: 6.99 (SE +/- 0.02, N = 2; MIN: 6.55 / MAX: 16.31)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

ms, Fewer Is Better (NCNN 20230517)
b: 0.99 (SE +/- 0.05, N = 2; MIN: 0.91 / MAX: 3.77)
c: 1.08 (SE +/- 0.14, N = 2; MIN: 0.91 / MAX: 3.75)
a: 1.26 (SE +/- 0.02, N = 2; MIN: 1.16 / MAX: 4.12)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

ms, Fewer Is Better (NCNN 20230517)
c: 14.01 (SE +/- 1.79, N = 2; MIN: 11.83 / MAX: 25.65)
b: 14.09 (SE +/- 1.53, N = 2; MIN: 12.03 / MAX: 25.74)
a: 15.86 (SE +/- 0.02, N = 2; MIN: 15.16 / MAX: 26.57)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

ms, Fewer Is Better (NCNN 20230517)
b: 57.88 (SE +/- 1.40, N = 2; MIN: 52 / MAX: 70.84)
c: 58.28 (SE +/- 0.65, N = 2; MIN: 52.14 / MAX: 69.93)
a: 59.03 (SE +/- 0.15, N = 2; MIN: 57.6 / MAX: 79.49)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

ms, Fewer Is Better (NCNN 20230517)
b: 10.22 (SE +/- 1.05, N = 2; MIN: 8.49 / MAX: 20.91)
c: 10.40 (SE +/- 0.82, N = 2; MIN: 8.57 / MAX: 21.1)
a: 11.18 (SE +/- 0.02, N = 2; MIN: 10.62 / MAX: 21.34)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

ms, Fewer Is Better (NCNN 20230517)
b: 8.08 (SE +/- 0.52, N = 2; MIN: 7.01 / MAX: 17.55)
c: 8.15 (SE +/- 0.55, N = 2; MIN: 7.01 / MAX: 17.98)
a: 8.68 (SE +/- 0.04, N = 2; MIN: 8.25 / MAX: 19.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

ms, Fewer Is Better (NCNN 20230517)
b: 28.71 (SE +/- 0.12, N = 2; MIN: 27.9 / MAX: 39.22)
c: 28.78 (SE +/- 0.15, N = 2; MIN: 27.96 / MAX: 39.21)
a: 28.84 (SE +/- 0.19, N = 2; MIN: 27.92 / MAX: 41.94)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

ms, Fewer Is Better (NCNN 20230517)
c: 29.11 (SE +/- 0.05, N = 2; MIN: 28.38 / MAX: 40.39)
b: 29.23 (SE +/- 0.15, N = 2; MIN: 28.34 / MAX: 41.02)
a: 29.60 (SE +/- 0.26, N = 2; MIN: 28.6 / MAX: 44.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

ms, Fewer Is Better (NCNN 20230517)
c: 13.00 (SE +/- 0.03, N = 2; MIN: 12.68 / MAX: 23.98)
b: 13.01 (SE +/- 0.01, N = 2; MIN: 12.67 / MAX: 27.85)
a: 13.05 (SE +/- 0.04, N = 2; MIN: 12.7 / MAX: 23.56)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

ms, Fewer Is Better (NCNN 20230517)
b: 8.57 (SE +/- 0.01, N = 2; MIN: 8.18 / MAX: 21.08)
c: 9.62 (SE +/- 1.13, N = 2; MIN: 8.23 / MAX: 23.19)
a: 11.19 (SE +/- 0.30, N = 2; MIN: 10.38 / MAX: 21.88)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

ms, Fewer Is Better (NCNN 20230517)
c: 188.79 (SE +/- 0.68, N = 2; MIN: 169.1 / MAX: 243.33)
b: 193.90 (SE +/- 0.73, N = 2; MIN: 167.64 / MAX: 231.56)
a: 207.46 (SE +/- 8.01, N = 2; MIN: 169.78 / MAX: 244.36)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

ms, Fewer Is Better (NCNN 20230517)
b: 3.94 (SE +/- 0.00, N = 2; MIN: 3.73 / MAX: 12.78)
a: 3.97 (SE +/- 0.09, N = 2; MIN: 3.7 / MAX: 15.04)
c: 4.03 (SE +/- 0.10, N = 2; MIN: 3.74 / MAX: 10.51)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Timed GCC Compilation

Time To Compile

Seconds, Fewer Is Better (Timed GCC Compilation 13.2)
c: 2394.14 (SE +/- 1.40, N = 2)
b: 2404.32 (SE +/- 2.12, N = 2)
a: 2404.77 (SE +/- 0.32, N = 2)

VkFFT

Test: FFT + iFFT R2C / C2R

Benchmark Score, More Is Better (VkFFT 1.2.31)
c: 5688 (SE +/- 69.00, N = 2)
b: 5589 (SE +/- 30.50, N = 2)
a: 5585 (SE +/- 3.50, N = 2)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

Benchmark Score, More Is Better (VkFFT 1.2.31)
a: 14246 (SE +/- 5.00, N = 2)
c: 14241 (SE +/- 4.00, N = 2)
b: 14232 (SE +/- 1.50, N = 2)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

Benchmark Score, More Is Better (VkFFT 1.2.31)
c: 1035 (SE +/- 1.00, N = 2)
b: 1034 (SE +/- 1.00, N = 2)
a: 1033 (SE +/- 0.50, N = 2)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

Benchmark Score, More Is Better (VkFFT 1.2.31)
a: 7486 (SE +/- 14.00, N = 2)
b: 7482 (SE +/- 4.00, N = 2)
c: 7478 (SE +/- 3.50, N = 2)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

Benchmark Score, More Is Better (VkFFT 1.2.31)
c: 5087 (SE +/- 9.00, N = 2)
a: 4944 (SE +/- 17.50, N = 2)
b: 4937 (SE +/- 6.50, N = 2)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

Benchmark Score, More Is Better (VkFFT 1.2.31)
c: 8183 (SE +/- 5.00, N = 2)
b: 8176 (SE +/- 0.50, N = 2)
a: 8176 (SE +/- 4.00, N = 2)
1. (CXX) g++ options: -O3

vkpeak

fp32-scalar

GFLOPS, More Is Better (vkpeak 20230730)
b: 935.02 (SE +/- 0.00, N = 2)
c: 934.81 (SE +/- 0.29, N = 2)
a: 934.74 (SE +/- 0.27, N = 2)

vkpeak

fp32-vec4

GFLOPS, More Is Better (vkpeak 20230730)
c: 1479.04 (SE +/- 0.00, N = 2)
a: 1478.58 (SE +/- 0.52, N = 2)
b: 1478.50 (SE +/- 0.51, N = 2)

vkpeak

fp16-scalar

GFLOPS, More Is Better (vkpeak 20230730)
a: 2309.29 (SE +/- 0.07, N = 2)
c: 2309.28 (SE +/- 0.04, N = 2)
b: 2309.19 (SE +/- 0.02, N = 2)

vkpeak

fp16-vec4

GFLOPS, More Is Better (vkpeak 20230730)
c: 3182.29 (SE +/- 0.12, N = 2)
a: 3182.23 (SE +/- 0.11, N = 2)
b: 3182.01 (SE +/- 0.04, N = 2)

vkpeak

int32-scalar

GIOPS, More Is Better (vkpeak 20230730)
c: 474.89 (SE +/- 0.00, N = 2)
a: 474.88 (SE +/- 0.03, N = 2)
b: 474.86 (SE +/- 0.00, N = 2)

vkpeak

int32-vec4

GIOPS, More Is Better (vkpeak 20230730)
c: 493.65 (SE +/- 0.00, N = 2)
b: 493.63 (SE +/- 0.00, N = 2)
a: 493.63 (SE +/- 0.02, N = 2)

vkpeak

int16-scalar

GIOPS, More Is Better (vkpeak 20230730)
a: 907.91 (SE +/- 0.10, N = 2)
c: 907.87 (SE +/- 0.03, N = 2)
b: 907.82 (SE +/- 0.01, N = 2)

vkpeak

int16-vec4

GIOPS, More Is Better (vkpeak 20230730)
b: 979.30 (SE +/- 0.05, N = 2)
a: 979.25 (SE +/- 0.08, N = 2)
c: 979.23 (SE +/- 0.00, N = 2)

VkResample

Upscale: 2x - Precision: Single

ms, Fewer Is Better (VkResample 1.0)
a: 100.01 (SE +/- 0.00, N = 2)
c: 100.01 (SE +/- 0.00, N = 2)
b: 100.01 (SE +/- 0.00, N = 2)
1. (CXX) g++ options: -O3

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

Frames Per Second, More Is Better (VVenC 1.9)
c: 1.677 (SE +/- 0.036, N = 2)
a: 1.654 (SE +/- 0.037, N = 2)
b: 1.650 (SE +/- 0.028, N = 2)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

Frames Per Second, More Is Better (VVenC 1.9)
c: 3.853 (SE +/- 0.081, N = 2)
b: 3.781 (SE +/- 0.014, N = 2)
a: 3.680 (SE +/- 0.014, N = 2)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 1080p - Video Preset: Fast

Frames Per Second, More Is Better (VVenC 1.9)
c: 5.249 (SE +/- 0.115, N = 2)
a: 5.140 (SE +/- 0.000, N = 2)
b: 5.124 (SE +/- 0.008, N = 2)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 1080p - Video Preset: Faster

Frames Per Second, More Is Better (VVenC 1.9)
c: 13.80 (SE +/- 0.79, N = 2)
a: 12.80 (SE +/- 0.08, N = 2)
b: 12.64 (SE +/- 0.11, N = 2)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
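A geometric mean is one common way to condense several results that share a unit and direction into a single figure per run. The sketch below applies it to the four VVenC results above; the `vvenc_fps` table and both helper functions are hypothetical names introduced for illustration, and the summary is an assumption about how one might compare the runs, not something this export reports.

```python
import math

# VVenC 1.9 frames per second, values copied from the four results above.
vvenc_fps = {
    "Bosphorus 4K - Fast": {"a": 1.654, "b": 1.650, "c": 1.677},
    "Bosphorus 4K - Faster": {"a": 3.680, "b": 3.781, "c": 3.853},
    "Bosphorus 1080p - Fast": {"a": 5.140, "b": 5.124, "c": 5.249},
    "Bosphorus 1080p - Faster": {"a": 12.80, "b": 12.64, "c": 13.80},
}


def geometric_mean(values: list[float]) -> float:
    """Geometric mean computed in log space; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geomean_by_run(results: dict[str, dict[str, float]]) -> dict[str, float]:
    """Collect each run's values across tests and take their geometric mean."""
    runs = sorted({run for per_test in results.values() for run in per_test})
    return {
        run: geometric_mean([per_test[run] for per_test in results.values()])
        for run in runs
    }


for run, score in geomean_by_run(vvenc_fps).items():
    print(f"{run}: geometric mean {score:.3f} FPS")
```

Unlike an arithmetic mean, the geometric mean is not dominated by the 1080p Faster preset, whose frame rates are several times larger than the 4K ones.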


Phoronix Test Suite v10.8.5