EVALUATION

Running (exciting) the network

After a network is trained, we want to run it with new data to perform some task in an actual application or to evaluate its performance using a special set of evaluation/testing data. The NICO toolkit has one main tool for running networks and several tools for evaluating various aspects of network performance.

Excite is the command used for running a network. Here is the syntax:

USAGE: Excite [options] Net Input
       Option                                                        Default
       -S        Treat 'Input' as a script file holding inputfiles   (off)
       -x ext    Extension for outputfiles                           (act)
       -d dir    Directory to store output files in                  (current)
       -m object Mark a unit, group or stream to be shown            (outputs)
       -e        Show error gradients (also does '-x grad')          (act's)
       -w        'Winner takes all', highest activity =1.0, others =0.0
       -b num    For each frame, show the names of the n highest outputs
       -c        Add a column, counting framenumber.                 (off)
       -B        Output binary floats (4 bytes)                      (text)
       -X stream Output a stream                                     (off)
       -s stream Output a softmax-stream                             (off)
       -n freq   Print the names of the units every 'freq' frames    (off)
       -h        Select high (6 figures) resolution                  (2 figures)
       -T level  Trace level                                         (0)

Excite can display the activities of selected units or stream vectors; by default it displays the activities of all output units. The data can be written in several formats (see the -B, -c and -h options) and goes to a file with the same base name as "Input" and the extension given by -x.

The -m option is used to mark a particular unit, group or stream to be monitored. There can be multiple -m options on the command line to monitor more than one object. If the command line has no -m options, the activities of all output units are selected.

If the output units are named, the -b option can be useful. It makes Excite output, for each frame, the names of the "num" units with the highest activity (the list of "num" names is ordered by activity).
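The ranking that -b describes can be illustrated in a few lines of Python. This is only a sketch of the behaviour described above, not NICO code: the function name top_names, the unit names and the activity values are all made up for illustration.

```python
def top_names(names, activities, num):
    """Return the names of the `num` most active units, most active first."""
    ranked = sorted(zip(names, activities), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:num]]


# Hypothetical output units and two frames of activities (not real NICO output).
unit_names = ["a", "e", "i", "sil"]
frames = [
    [0.10, 0.70, 0.15, 0.05],
    [0.05, 0.20, 0.15, 0.60],
]

for frame_number, frame in enumerate(frames):
    print(frame_number, top_names(unit_names, frame, 2))
```

With num set to 1 the list reduces to the single most active unit in each frame.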

Just like the training tools, Excite can be run either with one input file or with a list of files, which are then processed one by one. This is controlled with the -S option. If -S is given, "Input" is treated as a script file holding a list of filenames; otherwise "Input" is simply the name of the input file.

Classification (1 of N) performance evaluation

For classification tasks, CResult is the most important evaluation tool. A classification network is a network trained to discriminate between C disjoint classes. One output unit in the network is assigned to each class, and at each frame the "choice" of the network is the class of the unit with the highest activity. CResult examines the target files, excites the network with the input files and computes statistics about the resulting classification. It has the following syntax:
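The decision rule described above (the most active output unit wins) and the kind of statistics that can be derived from it can be sketched in Python. This is an illustration under stated assumptions, not CResult's actual implementation: the function names, the toy data and the row/column layout of the matrix (rows are correct classes, columns are chosen classes) are assumptions, and the correct class of a frame is taken to be the unit with the highest target value.

```python
def choose(values):
    """Index of the unit with the highest value (the 'choice' for one frame)."""
    return max(range(len(values)), key=lambda i: values[i])


def confusion(targets, outputs, num_classes):
    """Count frames by (correct class, chosen class). Layout is an assumption."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for target, output in zip(targets, outputs):
        matrix[choose(target)][choose(output)] += 1
    return matrix


def accuracy(matrix):
    """Fraction of frames on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy data: four frames, three classes, 1-of-N targets (hypothetical values).
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
outputs = [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1], [0.5, 0.2, 0.3], [0.2, 0.3, 0.5]]

matrix = confusion(targets, outputs, 3)
for row in matrix:
    print(row)
print("accuracy:", accuracy(matrix))
```

In this toy data two of the four frames are classified correctly, so the sketch reports an accuracy of 0.5.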

USAGE: CResult [options] Net Input
       Option                                                        Default
       -S        Treat 'Input' as a script file holding inputfiles   (off)
       -m object Mark a unit or a group to be tested         (all output units)
       -c        Print confusion matrix                              (off)
       -n N      Show "within-top-N" statistics                      (off)
       -x label  Exclude frames were 'label' has highest target      (off)

The -m option is used to mark a particular unit or group to be evaluated. There can be multiple -m options on the command line. If the command line has no -m options, all output units are selected.

The -n option shows how well the classification network performs in cases where it failed to "choose" the correct class. It makes CResult show how often the correct class was among the N nodes with the highest activity.
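A "within-top-N" count can be sketched as follows. This is a hypothetical illustration, not CResult's code: the function names and data are invented, and the sketch computes the rate over all frames, which is an assumption about how the statistic is normalised.

```python
def within_top_n(target, output, n):
    """True if the correct class is among the n most active output units."""
    correct = max(range(len(target)), key=lambda i: target[i])
    ranked = sorted(range(len(output)), key=lambda i: output[i], reverse=True)
    return correct in ranked[:n]


def top_n_rate(targets, outputs, n):
    """Fraction of frames whose correct class is within the top n."""
    hits = sum(1 for t, o in zip(targets, outputs) if within_top_n(t, o, n))
    return hits / len(targets) if targets else 0.0


# Toy data: four frames, three classes (hypothetical values).
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
outputs = [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1], [0.5, 0.2, 0.3], [0.2, 0.3, 0.5]]

for n in (1, 2):
    print(n, top_n_rate(targets, outputs, n))
```

Here the two misclassified frames both have the correct class as the second most active unit, so the rate rises from 0.5 for N = 1 to 1.0 for N = 2.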

In some experiments, one or more classes are less interesting than the others but still cover a large part of the data. In other words, the rare classes are the interesting ones, and including the common but less interesting classes in the statistics blurs the picture. In these cases the -x option is useful, since it removes the frames where the correct class is "label" from the statistics. "label" is the name of an output unit.
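The effect of excluding a dominant class can be illustrated with a small Python sketch. It is not CResult's implementation: the function names, the unit names ("sil", "a", "e") and the data are hypothetical, and the correct class of a frame is assumed to be the unit with the highest target value.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_excluding(names, targets, outputs, excluded=None):
    """Frame accuracy, skipping frames whose correct class is named `excluded`."""
    kept = 0
    correct = 0
    for target, output in zip(targets, outputs):
        true_class = argmax(target)
        if excluded is not None and names[true_class] == excluded:
            continue  # frame removed from the statistics
        kept += 1
        if argmax(output) == true_class:
            correct += 1
    return correct / kept if kept else 0.0


# Toy data: "sil" is common and easy, the other classes are rare (hypothetical).
names = ["sil", "a", "e"]
targets = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
outputs = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],
           [0.1, 0.8, 0.1], [0.1, 0.6, 0.3]]

print("all frames:   ", accuracy_excluding(names, targets, outputs))
print("without 'sil':", accuracy_excluding(names, targets, outputs, "sil"))
```

With all frames included the toy accuracy is 0.8. Once the three "sil" frames are removed, only one of the two remaining frames is correct and the accuracy drops to 0.5, which shows how a common class can mask errors on the rare ones.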