DNN

SoftNeuro stores neural network models in a file format we refer to as DNN. A single DNN can contain multiple networks (Nets); for instance, a DNN may contain three Nets called preprocess, main, and postprocess. The file extension for DNN files is .dnn. Any layer or routine parameters not stored in the .dnn file are automatically set to their default values at execution time.


Net

A Net is a single network instance saved inside a DNN. A Net consists of the Layers that form a neural network. A DNN normally contains a primary network called main; additional networks called preprocess and postprocess may be connected to it to perform the operations their names indicate.


Layer

A Layer corresponds to a single layer of a neural network. During execution, each Layer is assigned a Routine. The basic operation is defined by the Layer, but the actual algorithm and instruction set used to perform it depend on the Routine.


Routine

A Routine specifies the internal implementation of the calculations performed by a given layer. For widely used layers such as conv2 and depthwise_conv2, multiple routines are implemented. Environment-specific implementations are generally faster, so the ability to choose the best routine for the current environment enables speed optimization. For instance, the conv2 layer has a cpu routine for processing on the CPU and a cuda routine for processing on CUDA-enabled GPUs.
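The layer/routine split described above can be sketched as a registry that maps one logical operation to several interchangeable implementations. All names here are illustrative, not SoftNeuro's actual API:

```python
# Conceptual sketch of the layer/routine split: one logical operation
# (conv2) with several interchangeable implementations, selected at
# run time. Illustrative only; not SoftNeuro internals.

ROUTINES = {}

def register(layer, device):
    """Register a routine implementation for a (layer, device) pair."""
    def wrap(fn):
        ROUTINES[(layer, device)] = fn
        return fn
    return wrap

@register("conv2", "cpu")
def conv2_cpu(x):
    return f"conv2 on CPU: {x}"

@register("conv2", "cuda")
def conv2_cuda(x):
    return f"conv2 on CUDA: {x}"

def run_layer(layer, device, x):
    # The same layer executes with whichever routine fits the platform.
    return ROUTINES[(layer, device)](x)
```

The point of the indirection is that the network definition only names the layer; the routine actually used can be swapped per platform without touching the model.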

Routine descriptor

A routine descriptor is a notation that describes how a routine is implemented. When profiling a DNN with the profile and run commands of the CLI tool, the --routine option accepts this notation to specify a routine. You can also check which routine each layer uses with the routines command.

A routine descriptor is composed of the parameters listed below. Only DEVICE is required.

Parameter  Content
DEVICE     Device name: cpu, opencl, cuda, etc.
DTYPE      Data type used for calculations: float32 (default), float16, qint8, etc.
CH         Position of channels in the input tensor: chf (channels-first) or chl (channels-last, default).
LEVEL      Level of the algorithm used: naive, fast (default), or faster. The levels are listed in ascending order of speed; accuracy may decrease as speed increases.
AUX        Additional information, for instance to select a specific algorithm. Defaults to the empty string ('').

The following are some examples:

  • cpu/faster/wg3x3_nxn
    DEVICE='cpu', DTYPE='float32', CH='chl', LEVEL='faster', AUX='wg3x3_nxn'
  • opencl:float16
    DEVICE='opencl', DTYPE='float16', CH='chl', LEVEL='fast', AUX=''
  • cpu:chf/naive
    DEVICE='cpu', DTYPE='float32', CH='chf', LEVEL='naive', AUX=''
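The mapping from descriptor to parameters can be sketched with a small parser. The grammar assumed here, DEVICE[:DTYPE][:CH][/LEVEL[/AUX]], is inferred from the examples above; the exact syntax accepted by SoftNeuro's tools may differ.

```python
# Illustrative routine-descriptor parser, inferred from the examples
# above. Assumed grammar: DEVICE[:DTYPE][:CH][/LEVEL[/AUX]], with
# defaults float32, chl, fast, and an empty AUX string.

CHS = {"chf", "chl"}  # channel-position tokens, to disambiguate ':' parts

def parse_descriptor(desc):
    head, *slash = desc.split("/")
    device, *colon = head.split(":")
    out = {"DEVICE": device, "DTYPE": "float32", "CH": "chl",
           "LEVEL": "fast", "AUX": ""}
    for token in colon:
        # A colon-separated token is either a channel position or a dtype.
        if token in CHS:
            out["CH"] = token
        else:
            out["DTYPE"] = token
    if slash:
        out["LEVEL"] = slash[0]
    if len(slash) > 1:
        out["AUX"] = slash[1]
    return out
```

For example, parse_descriptor('cpu/faster/wg3x3_nxn') reproduces the first expansion listed above.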

Plugin libraries

In addition to libsoftneuro, the main SoftNeuro library, there are wrapper libraries that depend on particular environments and platforms. These plugin libraries are loaded automatically when placed in the same directory as libsoftneuro. Copy only the plugin libraries appropriate for your platform.

The following table describes these plugin libraries.

Plugin library              Description
plugin_avx2                 Wraps the implementation that uses the AVX2 instruction set on Intel CPUs.
plugin_neon2                Wraps the implementation that uses the NEON (ARMv8) instruction set on ARM CPUs.
plugin_neon3                Wraps the implementation that uses the NEON (ARMv8) instruction set on ARM CPUs.
plugin_cuda-xx.x_cudnn-y.y  Wraps the implementation that uses the CUDA and cuDNN libraries. Usable only on platforms that support the specified versions of CUDA (xx.x) and cuDNN (y.y).
plugin_opencl               Wraps the implementation that uses OpenCL. Usable only on platforms that support OpenCL (2.0+).
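The discovery rule described above, load whatever plugin_* files sit next to the main library, can be sketched as a directory scan. This is an assumption-level illustration of the mechanism, not SoftNeuro's actual loader:

```python
# Hedged sketch of plugin discovery as described above: scan the
# directory containing the main library for files whose names start
# with "plugin_". Illustrative only; not SoftNeuro's actual loader.
import os

def find_plugins(lib_dir):
    """Return plugin library filenames found next to the main library."""
    return sorted(
        name for name in os.listdir(lib_dir)
        if name.startswith("plugin_")
    )
```

Because discovery is purely name-based, removing an inapplicable plugin file (for example, a CUDA plugin on a machine without CUDA) is enough to keep it from being loaded.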