Run and Profiling Commands¶
run¶
Runs inference using the given DNN file. It can output the inference results and execution times.
Usage
usage: softneuro run [-o ONPY]... [-p PASSWORD] [--recipe RECIPE] [--batch BATCH]
[--ishape SHAPE] [--keep_img_ar PADDINGCOLOR]
[--img_resize_mode RESIZEMODE] [--thread NTHREADS] [--noboost]
[--affinity MASK[@THREAD_INDICES]]
[-r ROUTINE[@LAYER_INDICES]] [-R RPARAMS[@LAYER_INDICES]] [--nobufopt DEVICE]
[--lib LIB] [-l LNUM] [--detail] [--detail2] [--bylayer]
[--dump DUMPDIR] [--dump2 DUMPDIR] [-t NTOPS] [-h]
DNN [INPUT [INPUT ...]]
Arguments
Argument | Description |
---|---|
DNN | DNN file for inference execution. |
INPUT | Input for inference execution. Can be a numpy file or an image file. If not provided, input will be uniform random numbers from [-1, 1]. |
Flags
Flag | Description |
---|---|
-p PASSWORD, --pass PASSWORD | Password to run an encrypted DNN file. |
-o ONPY | File name to output the inference results as a numpy file. |
--recipe RECIPE | Set the DNN recipe file. |
--batch BATCH | Input batch size. |
--ishape SHAPE | Input shape. (Example: 1x224x224x3). |
--keep_img_ar PADDINGCOLOR | Keeps aspect ratio when resizing input image. The aspect ratio is not kept by default. Margin space is filled with the color specified by PADDINGCOLOR. PADDINGCOLOR can be specified by RGB value, for example, '0, 0, 0'. |
--img_resize_mode RESIZEMODE | Specifies the resizing mode. 'bilinear' or 'nearest' can be specified. Default is 'bilinear'. |
--thread NTHREADS | How many threads should be used for execution. Defaults to the number of CPU cores. |
--noboost | Make threads wait on a condition variable while waiting for a task. If this isn't set, threads busy-wait. |
--affinity MASK[@THREAD_INDICES] | Use the affinity mask given by MASK on the threads given by THREAD_INDICES. MASK should be a little endian hexadecimal (0x..), binary (0b..), or decimal number. If THREAD_INDICES isn't set, all threads will use the given mask. For more information on THREAD_INDICES use the softneuro help thread_indices command. |
-r, --routine ROUTINE[@LAYER_INDICES] | Set the routines to use. If not set, the best available routine is usually chosen (e.g. CUDA when CUDA support is present); otherwise the default is cpu. This setting is ignored if the model is tuned. If LAYER_INDICES isn't set, the routine is applied to all layers in the main network. For more information on LAYER_INDICES use the softneuro help layer_indices command. |
-R RPARAMS[@LAYER_INDICES], --rparams RPARAMS[@LAYER_INDICES] | Set the routine parameters to use. This setting is ignored if the model is tuned. If LAYER_INDICES isn't set, the parameters are applied to all layers in the main network. For more information on LAYER_INDICES use the softneuro help layer_indices command. |
--nobufopt DEVICE | Disable the buffer optimizer for routines that run on the given device. |
--lib LIB | Set an OpenCL binary file when using online compilation. |
-l, --loop LNUM | Run inference LNUM times for benchmarking. |
--detail | Show detailed inference statistics. |
--detail2 | Show even more detailed inference statistics. If a layer is made up of other layers this shows the processing times of the internal layers as well. |
--bylayer | Show detailed inference statistics by layer. |
--dump DUMPDIR | Dump each layer output as a numpy file in the given folder. |
--dump2 DUMPDIR | Dump each layer output and internal layer outputs if they exist in the given folder. |
-t, --top NTOPS | Shows the top NTOPS scores and labels for image classification models. |
-h, --help | Shows the command help. |
Example
$ softneuro run densenet121.dnn --thread 8 --affinity 0xf0@0..3 --affinity 0x0f@4..7 --top 5 --loop 10 shovel.jpg
---------------------------------
Top 5 Labels
---------------------------------
# SCORE LABEL
1 0.9999 shovel
2 0.0001 hatchet
3 0.0000 broom
4 0.0000 swab
5 0.0000 spatula
---------------------------------
Statistics
---------------------------------
FUNCTION AVE(us) MIN(us) MAX(us) #RUN
Dnn_load() 43,070 43,070 43,070 1
Dnn_compile() 28,567 28,567 28,567 1
Dnn_forward() 39,877 39,751 39,983 10
Used memory: 88,403,968 Bytes
---------------------------------
Benchmark
---------------------------------
preprocess: 81 68 68 69 70 64 71 69 70 68
main: 39872 39698 39710 39679 39896 39884 39770 39801 39908 39824
TOTAL: 39955 39767 39778 39749 39968 39949 39842 39873 39981 39894
The inference time is given by Dnn_forward().
AVE, MIN and MAX are, respectively, the average, minimum and maximum execution times for the number of runs shown under #RUN.
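The dump and output flags compose with a normal run; as a minimal sketch, the following saves the final result as a numpy file and dumps each layer output into a directory (the file and directory names here are hypothetical):
$ softneuro run densenet121.dnn shovel.jpg -o result.npy --dump layer_outputs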
init¶
Initializes profiling data.
Usage
usage: softneuro init [--thread NTHREADS] [--affinity MASK[@THREAD_INDICES]]
[--pass PASSWORD] [--help]
PROF DNN
Arguments
Argument | Description |
---|---|
PROF | Directory where the profiling data will be initialized. |
DNN | DNN file to be profiled. |
Flags
Flag | Description |
---|---|
--thread NTHREADS | How many threads should be used for execution. Defaults to the number of CPU cores. |
--affinity MASK[@THREAD_INDICES] | Use the affinity mask given by MASK on the threads given by THREAD_INDICES. MASK should be a little endian hexadecimal (0x..), binary (0b..), or decimal number. If THREAD_INDICES isn't set, all threads will use the given mask. For more information on THREAD_INDICES use the softneuro help thread_indices command. |
--pass PASSWORD | Password to profile an encrypted DNN file. |
-h, --help | Shows the command help. |
Example
The command below creates the mobilenet_prof directory with profiling data. Note that there is no terminal output.
$ softneuro init mobilenet_prof mobilenet.dnn
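If the target should run on specific cores, the thread count and affinity mask can be fixed at initialization as well; a sketch, with an illustrative 4-thread configuration pinned to the lower four cores:
$ softneuro init --thread 4 --affinity 0x0f mobilenet_prof mobilenet.dnn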
add¶
Assigns routines to layers in the profiling data.
Usage
usage: softneuro add [--dnn DNN] [--pass PASSWORD] [--ref REF] [--ref-pass REF_PASSWORD] [--help]
PROF [ROUTINE[@LAYER_INDICES]]...
Arguments
Argument | Description |
---|---|
PROF | Directory containing profiling data. |
ROUTINE[@LAYER_INDICES] | Set the routine given by ROUTINE on the layers given by LAYER_INDICES. If LAYER_INDICES isn't set, the routine is applied to all layers in the main network. The ROUTINE format can be checked with the softneuro help routine_desc command, and the LAYER_INDICES format can be checked with the softneuro help layer_indices command. |
Flags
Flag | Description |
---|---|
--dnn DNN | A DNN file used to create profiling data. |
-p, --pass PASSWORD | The password required to use the prof file. |
--ref REF | The reference DNN file when profiling a secret DNN. |
--ref-pass REF_PASSWORD | The password for REF. |
-h, --help | Shows the command help. |
Example
Set all main network layers to use the cpu:qint8 routine, if supported.
$ softneuro add mobilenet_prof cpu:qint8
adding routines...done.
$ softneuro status mobilenet_prof
[preprocess]
# NAME ROUTINE TIME DESC PARAMS
0 ? (source)
1 ? (madd)
2 ? (sink)
?
[main]
# NAME ROUTINE TIME DESC PARAMS
0 input_1 (source)
1 conv1 (conv2) cpu:qint8 (9)
2 conv_dw_1 (depthwise_conv2) cpu:qint8 (3)
3 conv_pw_1 (conv2) cpu:qint8 (9)
4 conv_dw_2 (depthwise_conv2) cpu:qint8 (3)
5 conv_pw_2 (conv2) cpu:qint8 (9)
6 conv_dw_3 (depthwise_conv2) cpu:qint8 (3)
7 conv_pw_3 (conv2) cpu:qint8 (9)
8 conv_dw_4 (depthwise_conv2) cpu:qint8 (3)
9 conv_pw_4 (conv2) cpu:qint8 (9)
10 conv_dw_5 (depthwise_conv2) cpu:qint8 (3)
11 conv_pw_5 (conv2) cpu:qint8 (9)
12 conv_dw_6 (depthwise_conv2) cpu:qint8 (3)
13 conv_pw_6 (conv2) cpu:qint8 (9)
14 conv_dw_7 (depthwise_conv2) cpu:qint8 (3)
15 conv_pw_7 (conv2) cpu:qint8 (9)
16 conv_dw_8 (depthwise_conv2) cpu:qint8 (3)
17 conv_pw_8 (conv2) cpu:qint8 (9)
18 conv_dw_9 (depthwise_conv2) cpu:qint8 (3)
19 conv_pw_9 (conv2) cpu:qint8 (9)
20 conv_dw_10 (depthwise_conv2) cpu:qint8 (3)
21 conv_pw_10 (conv2) cpu:qint8 (9)
22 conv_dw_11 (depthwise_conv2) cpu:qint8 (3)
23 conv_pw_11 (conv2) cpu:qint8 (9)
24 conv_dw_12 (depthwise_conv2) cpu:qint8 (3)
25 conv_pw_12 (conv2) cpu:qint8 (9)
26 conv_dw_13 (depthwise_conv2) cpu:qint8 (3)
27 conv_pw_13 (conv2) cpu:qint8 (9)
28 global_average_pooling2d_1 (global_average_pool)
29 reshape_1 (reshape) cpu:qint8 (1)
30 conv_preds (conv2) cpu:qint8 (9)
31 act_softmax (softmax)
32 reshape_2 (reshape) cpu:qint8 (1)
33 sink_0 (sink)
?
ROUTINES cpu:qint8
TOTAL ?
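Routines can also be assigned to a subset of layers through LAYER_INDICES; for instance, a sketch that applies cpu:qint8 only to layers 1 through 27 (the index range is illustrative):
$ softneuro add mobilenet_prof cpu:qint8@1..27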
rm¶
Removes the given routines from the profiling data.
Usage
usage: softneuro rm [--dnn DNN] [--pass PASSWORD] [--ref REF] [--ref-pass REF_PASSWORD] [--help] PROF [ROUTINE[@LAYER_INDICES]]...
Arguments
Argument | Description |
---|---|
PROF | Directory containing profiling data. |
ROUTINE[@LAYER_INDICES] | Remove the routine given by ROUTINE from the layers given by LAYER_INDICES. If LAYER_INDICES isn't set, the routine is removed from all layers in the main network. The ROUTINE format can be checked with the softneuro help routine_desc command, and the LAYER_INDICES format can be checked with the softneuro help layer_indices command. |
Flags
Flag | Description |
---|---|
--dnn DNN | A DNN file used to create profiling data. |
-p PASSWORD, --pass PASSWORD | The password required to use the prof file. |
--ref REF | The reference DNN file when profiling a secret DNN. |
--ref-pass REF_PASSWORD | The password for REF. |
-h, --help | Shows the command help. |
Example
Remove all routine settings from mobilenet_prof.
$ softneuro rm mobilenet_prof
removing routines...done.
$ softneuro status mobilenet_prof
[preprocess]
# NAME ROUTINE TIME DESC PARAMS
0 ? (source)
1 ? (madd)
2 ? (sink)
?
[main]
# NAME ROUTINE TIME DESC PARAMS
0 input_1 (source)
1 conv1 (conv2)
2 conv_dw_1 (depthwise_conv2)
3 conv_pw_1 (conv2)
4 conv_dw_2 (depthwise_conv2)
5 conv_pw_2 (conv2)
6 conv_dw_3 (depthwise_conv2)
7 conv_pw_3 (conv2)
8 conv_dw_4 (depthwise_conv2)
9 conv_pw_4 (conv2)
10 conv_dw_5 (depthwise_conv2)
11 conv_pw_5 (conv2)
12 conv_dw_6 (depthwise_conv2)
13 conv_pw_6 (conv2)
14 conv_dw_7 (depthwise_conv2)
15 conv_pw_7 (conv2)
16 conv_dw_8 (depthwise_conv2)
17 conv_pw_8 (conv2)
18 conv_dw_9 (depthwise_conv2)
19 conv_pw_9 (conv2)
20 conv_dw_10 (depthwise_conv2)
21 conv_pw_10 (conv2)
22 conv_dw_11 (depthwise_conv2)
23 conv_pw_11 (conv2)
24 conv_dw_12 (depthwise_conv2)
25 conv_pw_12 (conv2)
26 conv_dw_13 (depthwise_conv2)
27 conv_pw_13 (conv2)
28 global_average_pooling2d_1 (global_average_pool)
29 reshape_1 (reshape)
30 conv_preds (conv2)
31 act_softmax (softmax)
32 reshape_2 (reshape)
33 sink_0 (sink)
?
TOTAL ?
?
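Instead of clearing everything, a single routine can be removed from selected layers; a sketch with illustrative indices:
$ softneuro rm mobilenet_prof cpu:qint8@1..5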
reset¶
Resets profiling data to defaults.
Usage
usage: softneuro reset [--dnn DNN] [--pass PASSWORD] [--ref REF] [--ref-pass REF_PASSWORD] [--help]
PROF [ROUTINE[@LAYER_INDICES]]...
Arguments
Argument | Description |
---|---|
PROF | Directory containing profiling data. |
ROUTINE[@LAYER_INDICES] | Reset the routine given by ROUTINE on the layers given by LAYER_INDICES. If LAYER_INDICES isn't set, the routine is reset on all layers in the main network. The ROUTINE format can be checked with the softneuro help routine_desc command, and the LAYER_INDICES format can be checked with the softneuro help layer_indices command. |
Flags
Flag | Description |
---|---|
--dnn DNN | A DNN file used to create profiling data. |
-p PASSWORD, --pass PASSWORD | The password required to use the prof file. |
--ref REF | The reference DNN file when profiling a secret DNN. |
--ref-pass REF_PASSWORD | The password for REF. |
-h, --help | Shows the command help. |
Example
Reset the mobilenet_prof profiling data.
$ softneuro reset mobilenet_prof
resetting routines...done.
$ softneuro status mobilenet_prof
[preprocess]
# NAME ROUTINE TIME DESC PARAMS
0 ? (source)
1 ? (madd) cpu (3)
2 ? (sink)
?
[main]
# NAME ROUTINE TIME DESC PARAMS
0 input_1 (source)
1 conv1 (conv2) cpu (15)
2 conv_dw_1 (depthwise_conv2) cpu (3)
3 conv_pw_1 (conv2) cpu (47)
4 conv_dw_2 (depthwise_conv2) cpu (3)
5 conv_pw_2 (conv2) cpu (47)
6 conv_dw_3 (depthwise_conv2) cpu (3)
7 conv_pw_3 (conv2) cpu (47)
8 conv_dw_4 (depthwise_conv2) cpu (3)
9 conv_pw_4 (conv2) cpu (47)
10 conv_dw_5 (depthwise_conv2) cpu (3)
11 conv_pw_5 (conv2) cpu (47)
12 conv_dw_6 (depthwise_conv2) cpu (3)
13 conv_pw_6 (conv2) cpu (47)
14 conv_dw_7 (depthwise_conv2) cpu (3)
15 conv_pw_7 (conv2) cpu (47)
16 conv_dw_8 (depthwise_conv2) cpu (3)
17 conv_pw_8 (conv2) cpu (47)
18 conv_dw_9 (depthwise_conv2) cpu (3)
19 conv_pw_9 (conv2) cpu (47)
20 conv_dw_10 (depthwise_conv2) cpu (3)
21 conv_pw_10 (conv2) cpu (47)
22 conv_dw_11 (depthwise_conv2) cpu (3)
23 conv_pw_11 (conv2) cpu (47)
24 conv_dw_12 (depthwise_conv2) cpu (3)
25 conv_pw_12 (conv2) cpu (47)
26 conv_dw_13 (depthwise_conv2) cpu (3)
27 conv_pw_13 (conv2) cpu (47)
28 global_average_pooling2d_1 (global_average_pool) cpu (1)
29 reshape_1 (reshape) cpu (1)
30 conv_preds (conv2) cpu (47)
31 act_softmax (softmax) cpu (1)
32 reshape_2 (reshape) cpu (1)
33 sink_0 (sink)
?
ROUTINES cpu
TOTAL ?
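Per the usage line, an optional routine argument presumably restricts the reset to matching layers; a sketch (illustrative):
$ softneuro reset mobilenet_prof cpu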
status¶
Shows the routines, parameters and measured profiling times for each layer.
Usage
usage: softneuro status [--dnn DNN] [--pass PASSWORD] [--ref REF] [--ref-pass REF_PASSWORD] [--at INDEX]
[--estimate MODE] [--csv] [--help]
PROF
Arguments
Argument | Description |
---|---|
PROF | Directory containing profiling data. |
Flags
Flag | Description |
---|---|
--dnn DNN | A DNN file used to create profiling data. |
-p, --pass PASSWORD | The password required to use the encrypted prof file. |
--ref REF | The reference DNN file when profiling a secret DNN. |
--ref-pass REF_PASSWORD | The password for REF. |
-@, --at INDEX | Show only the information for the layer at the given index. |
--estimate MODE | Execution time estimation mode. Can be robust (default), min or ave . |
--csv | Output information in CSV format. |
--help | Shows the command help. |
Example
The example output below is from after running the profile command to measure execution times.
$ softneuro status mobilenet_prof
[preprocess]
# NAME ROUTINE TIME DESC PARAMS
0 ? (source)
1 ? (madd) cpu (3) 28 cpu/avx {"ops_in_task":16384}
2 ? (sink)
28
[main]
# NAME ROUTINE TIME DESC PARAMS
0 input_1 (source)
1 conv1 (conv2) cpu (15) 213 cpu/owc64_avx {"cache":8192,"task_ops":131072}
2 conv_dw_1 (depthwise_conv2) cpu (3) 110 cpu/owc32_avx {"cache":8192,"task_ops":65536}
3 conv_pw_1 (conv2) cpu (47) 195 cpu/m1x1l_avx {"cache":1048576,"oxynum_in_task":144}
4 conv_dw_2 (depthwise_conv2) cpu (3) 60 cpu/owc32_avx {"cache":8192,"task_ops":32768}
5 conv_pw_2 (conv2) cpu (47) 177 cpu/m1x1l_avx {"cache":1048576,"oxynum_in_task":72}
6 conv_dw_3 (depthwise_conv2) cpu (3) 113 cpu/owc32_avx {"cache":8192,"task_ops":65536}
7 conv_pw_3 (conv2) cpu (47) 328 cpu/m1x1l_avx {"cache":1048576,"oxynum_in_task":36}
8 conv_dw_4 (depthwise_conv2) cpu (3) 40 cpu/owc32_avx {"cache":8192,"task_ops":32768}
9 conv_pw_4 (conv2) cpu (47) 167 cpu/m1x1l_avx {"cache":1048576,"oxynum_in_task":96}
10 conv_dw_5 (depthwise_conv2) cpu (3) 68 cpu/owc32_avx {"cache":8192,"task_ops":131072}
11 conv_pw_5 (conv2) cpu (47) 320 cpu/m1x1l_avx {"cache":1048576,"oxynum_in_task":96}
12 conv_dw_6 (depthwise_conv2) cpu (3) 23 cpu/owc32_avx {"cache":8192,"task_ops":65536}
13 conv_pw_6 (conv2) cpu (47) 164 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
14 conv_dw_7 (depthwise_conv2) cpu (3) 34 cpu/owc32_avx {"cache":8192,"task_ops":65536}
15 conv_pw_7 (conv2) cpu (47) 313 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
16 conv_dw_8 (depthwise_conv2) cpu (3) 34 cpu/owc32_avx {"cache":8192,"task_ops":65536}
17 conv_pw_8 (conv2) cpu (47) 313 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
18 conv_dw_9 (depthwise_conv2) cpu (3) 34 cpu/owc32_avx {"cache":8192,"task_ops":65536}
19 conv_pw_9 (conv2) cpu (47) 313 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
20 conv_dw_10 (depthwise_conv2) cpu (3) 34 cpu/owc32_avx {"cache":8192,"task_ops":65536}
21 conv_pw_10 (conv2) cpu (47) 313 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
22 conv_dw_11 (depthwise_conv2) cpu (3) 34 cpu/owc32_avx {"cache":8192,"task_ops":65536}
23 conv_pw_11 (conv2) cpu (47) 313 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
24 conv_dw_12 (depthwise_conv2) cpu (3) 14 cpu/owc32_avx {"cache":8192,"task_ops":65536}
25 conv_pw_12 (conv2) cpu (47) 169 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":16}
26 conv_dw_13 (depthwise_conv2) cpu (3) 19 cpu/owc32_avx {"cache":8192,"task_ops":32768}
27 conv_pw_13 (conv2) cpu (47) 336 cpu/m1x1l2_avx {"cache":1048576,"oxynum_in_task":8}
28 global_average_pooling2d_1 (global_average_pool) cpu (1) 13 cpu/naive {}
29 reshape_1 (reshape) cpu (1) 0 cpu {}
30 conv_preds (conv2) cpu (47) 23 cpu/owc64_avx {"cache":8192,"task_ops":32768}
31 act_softmax (softmax) cpu (1) 21 cpu/naive {}
32 reshape_2 (reshape) cpu (1) 0 cpu {}
33 sink_0 (sink)
4,308
ROUTINES cpu
TOTAL 4,336
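The --at and --csv flags narrow or export this output; for example, a sketch that inspects only layer 27 and then writes the whole table as CSV (the output file name is hypothetical):
$ softneuro status --at 27 mobilenet_prof
$ softneuro status --csv mobilenet_prof > mobilenet_prof.csv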
profile¶
Runs profiling based on the profiling data.
Usage
usage: softneuro profile [--dnn DNN] [--pass PASSWORD] [--help] PROF
Arguments
Argument | Description |
---|---|
PROF | Directory containing profiling data. |
Flags
Flag | Description |
---|---|
--dnn DNN | A DNN file used to create profiling data. |
-p, --pass PASSWORD | The password required to use the encrypted prof file. |
--help | Shows the command help. |
Example
After using the init command to generate profiling data, the profile command measures execution times and saves the profiling information into the profiling data directory.
$ softneuro profile mobilenet_prof
profiling...100.0% [00:01]
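Putting the profiling commands together, a typical end-to-end sketch looks like this (file and directory names illustrative):
$ softneuro init mobilenet_prof mobilenet.dnn
$ softneuro add mobilenet_prof cpu:qint8
$ softneuro profile mobilenet_prof
$ softneuro status mobilenet_prof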
tune¶
Tunes a DNN file for faster inference times. If profiling data isn't provided, the command automatically runs profiling.
Usage
usage: softneuro tune [--prof PROF] [--recipe RECIPE] [--thread NTHREADS]
[--affinity MASK[@THREAD_INDICES]] [--pass PASSWORD]
[--routine ROUTINE[@LAYER_INDICES]]... [--estimate MODE] [--help]
INPUT OUTPUT
Arguments
Argument | Description |
---|---|
INPUT | DNN file to be tuned. |
OUTPUT | Output tuned DNN file. |
Flags
Flag | Description |
---|---|
--prof PROF | Directory containing profiling data. |
--recipe RECIPE | Directory containing recipe data. |
--thread NTHREADS | How many threads should be used for execution. Defaults to the number of CPU cores. |
--affinity MASK[@THREAD_INDICES] | Use the affinity mask given by MASK on the threads given by THREAD_INDICES. MASK should be a little endian hexadecimal (0x..), binary (0b..), or decimal number. If THREAD_INDICES isn't set, all threads will use the given mask. For more information on THREAD_INDICES use the softneuro help thread_indices command. |
-p, --pass PASSWORD | Password if the DNN file is encrypted. |
-r, --routine ROUTINE[@LAYER_INDICES] | Set the routine given by ROUTINE to the layers given by LAYER_INDICES . If LAYER_INDICES isn't set, the routine will be set to all layers in the main network. The ROUTINE format can be checked with the softneuro help routine_desc command, and the LAYER_INDICES format can be checked with the softneuro help layer_indices command. |
--estimate MODE | Execution time estimation mode. Can be robust (default), min or ave . |
-h, --help | Shows the command help. |
Example
After tuning, the vgg16_tuned.dnn file will be created.
$ softneuro tune vgg16.dnn vgg16_tuned.dnn
adding cpu routines...done.
profiling...100.0% [00:56] ETA[00:00]
[preprocess]
# NAME ROUTINE TIME DESC PARAMS
0 ? (source)
1 ? (permute) cpu (1) 155 cpu/naive {}
2 ? (madd) cpu (3) 29 cpu/avx {"ops_in_task":16384}
3 ? (sink)
184
[main]
# NAME ROUTINE TIME DESC PARAMS
0 input_1 (source)
1 block1_conv1 (conv2) cpu (67) 1,239 cpu/owc64_avx {"cache":8192,"task_ops":131072}
:
TOTAL 59,463
Tuning for OpenCL usage:
$ softneuro tune --routine opencl/fast@2..23 --routine cpu@1,24 vgg16.dnn vgg16_tuned.dnn
profiling..100.0% [01:23] ETR[00:00]
Tuning for OpenCL(float16) usage:
$ softneuro tune --routine opencl:float16/fast@2..23 --routine cpu@1,24 vgg16.dnn vgg16_tuned.dnn
profiling..100.0% [01:23] ETR[00:00]
Tuning for CUDA usage:
$ softneuro tune --routine cuda/fast@2..23 --routine cpu@1,24 vgg16.dnn vgg16_tuned.dnn
profiling..100.0% [01:23] ETR[00:00]
Tuning for CUDA(float16) usage:
$ softneuro tune --routine cuda:float16/fast@2..23 --routine cpu@1,24 vgg16.dnn vgg16_tuned.dnn
profiling..100.0% [01:23] ETR[00:00]
Tuning for 8bit quantization mode:
$ softneuro tune --routine cpu:qint8/fast vgg16.dnn vgg16_tuned.dnn
profiling..100.0% [01:23] ETR[00:00]
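A tuned DNN can then be benchmarked with the run command to confirm the improvement; a sketch (the loop count is illustrative, and with no INPUT given the input is uniform random):
$ softneuro run vgg16_tuned.dnn --loop 10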