nnvm.testing

Utilities for testing and benchmarks

nnvm.testing.ctx_list()

Get the context list for test cases
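
A minimal usage sketch, assuming the usual behaviour of ctx_list() (it yields (target, context) pairs for the targets enabled in the current build):

import nnvm.testing

# Each entry is a (target, ctx) pair; targets without an available
# device in this build are filtered out.
for target, ctx in nnvm.testing.ctx_list():
    print(target, ctx)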

nnvm.testing.check_computation

Helper utilities to check functions and their gradients.

nnvm.testing.check_computation.check_function(symbol, forward=None, backward=None, grad_input_vars=None, shape=None, dtype=None, in_range=None, values=None, exclude_targets=None, only_targets=None, additional_params=None, numerical_grads=None, numerical_grads_params=None, atol=1e-05, rtol=1e-05, quiet=False)

Compute the function and/or its gradients on a random input and raise an exception if the result doesn’t match the reference implementation.

Parameters:
  • symbol (nnvm.Symbol) – A symbol representing the output.
  • forward (Callable[..., List[numpy.ndarray]], optional) – A reference implementation to compare with.
  • backward (Callable[..., List[numpy.ndarray] or Dict[str, numpy.ndarray]], optional) – A reference implementation of gradients. Besides the normal inputs, it should also accept head_grads, a list of gradients of some scalar wrt the outputs (or just a single gradient if there is only one output). It should return either a dict mapping input variable names to the respective gradients, or a list of gradients wrt the variables from grad_input_vars in exactly the same order (alphabetical by default).
  • grad_input_vars (List[nnvm.Symbol or str], optional) – A list of variables with respect to which the gradients will be computed. None (default) means that all input variables will be used, in alphabetical order.
  • shape (Dict[nnvm.Symbol or str, Tuple[int]] or Tuple[int], optional) – A dict mapping input variable names to shapes, or just a single shape. By default shapes will be inferred from variables’ attributes (see the Examples). Note that this parameter takes precedence over variables’ attributes.
  • dtype (Dict[nnvm.Symbol or str, str] or str, optional) – A dict mapping input variable names to dtypes, or just a single dtype. By default dtypes will be inferred from variables’ attributes (see the Examples). If dtypes cannot be inferred for some variables then float32 will be used as a fallback. Note that this parameter takes precedence over variables’ attributes.
  • in_range (Dict[nnvm.Symbol or str, (float, float)] or (float, float), optional) – A dict mapping input variable names to ranges or just a single range (the same for all variables). Input values will be generated from uniform distributions on these ranges. head_grads can also be assigned a range this way.
  • values (Dict[nnvm.Symbol or str, numpy.ndarray], optional) – A dict explicitly providing values for some variables instead of random generation.
  • exclude_targets (Set[str], optional) – Skip compiling and running anything for these targets.
  • only_targets (Set[str], optional) – Test only for those targets from ctx_list() that are also in this set.
  • additional_params (dict, optional) – A dict of additional parameters which will be passed to forward and backward.
  • numerical_grads (bool or 'if_possible', optional) – Whether to additionally check against numerically computed gradients. If ‘if_possible’ or None (the default) is passed, the function will try to create a gradient computation graph and check gradients numerically only if this graph can be created; if some operations have unimplemented gradients, it will just issue a warning instead of failing. Checking against numerical gradients is done via the check_numerical_grads function.
  • numerical_grads_params (dict, optional) – Additional parameters for check_numerical_grads.
  • atol (float, optional) – Absolute tolerance for tvm.testing.assert_allclose. NOT used for numerical gradients.
  • rtol (float, optional) – Relative tolerance for tvm.testing.assert_allclose. NOT used for numerical gradients.
  • quiet (bool, optional) – Don’t dump additional information to stdout on failure.

Examples

x = sym.Variable("x", shape=(1, 2))
y = sym.Variable("y", shape=(1, 2))

# check the function and its gradients both numerically and using a reference function
check_function(x + 2*y,
               lambda x, y: x + 2*y,
               lambda x, y, head_grads: {'x': head_grads, 'y': 2*head_grads})

# just check gradients numerically
check_function(x + 2*y, numerical_grads=True)

# just check the forward computation
check_function(x + 2*y, lambda x, y: x + 2*y, numerical_grads=False)

# specifying dtype
check_function(x + 2*y, lambda x, y: x + 2*y, dtype='float64')

# dtypes can also be specified during variable creation with dtype codes
x = sym.Variable("x", dtype=0)
check_function(x + 1, shape=(2, 2), numerical_grads=True)
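
The numerical check itself can be tuned via numerical_grads_params, which is forwarded to check_numerical_grads; the tolerance names and values below are illustrative assumptions rather than recommended settings:

# hypothetical tolerances, assuming check_numerical_grads accepts atol/rtol
check_function(x + 1, shape=(2, 2), numerical_grads=True,
               numerical_grads_params={'atol': 0.05, 'rtol': 0.2})
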
nnvm.testing.check_computation.graph_to_function(graph, target, ctx, shape=None, dtype=None)

Convert a graph to a function taking keyword arguments and returning a list of results (both the arguments and the results are numpy arrays).

Example:

fun = graph_to_function(graph, "llvm", tvm.cpu(0))
[res1, res2] = fun(x=np.zeros((1,2)), y=np.zeros((1,)))

Parameters:
  • graph (nnvm.graph.Graph) – A graph we want to convert to a function.
  • target (str or tvm.target.Target) – The build target
  • ctx (TVMContext) – The context to deploy the module.
  • shape (Dict[str, Tuple[int]], optional) – A dict mapping input variable names to shapes. By default shapes will be inferred from variables’ attributes. Note that this parameter takes precedence over variables’ attributes.
  • dtype (Dict[str, str] or str, optional) – A dict mapping input variable names to dtypes, or just a single dtype. By default dtypes will be inferred from variables’ attributes. Note that this parameter takes precedence over variables’ attributes.
Returns:

function

Return type:

Callable[..., List[numpy.ndarray]]
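
A more complete sketch, assuming an LLVM-enabled TVM build: create a graph from a symbol, convert it to a callable and run it.

import numpy as np
import tvm
import nnvm.graph
import nnvm.symbol as sym
from nnvm.testing.check_computation import graph_to_function

x = sym.Variable("x", shape=(1, 2))
y = sym.Variable("y", shape=(1, 2))
graph = nnvm.graph.create(x + 2*y)

fun = graph_to_function(graph, "llvm", tvm.cpu(0))
[res] = fun(x=np.ones((1, 2)), y=np.ones((1, 2)))  # res is [[3., 3.]]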

nnvm.testing.check_computation.infer_shapes_dtypes(graph, shape=None, dtype=None, fallback_dtype=None)

Runs dtype and shape inference passes on a graph and returns the resulting graph along with the inferred information.

Parameters:
  • graph (nnvm.graph.Graph) – A graph we want to run inference on.
  • shape (Dict[str, Tuple[int]] or Tuple[int], optional) – A dict mapping input variable names to shapes. By default shapes will be inferred from variables’ attributes. Note that this parameter takes precedence over variables’ attributes.
  • dtype (Dict[str, str] or str, optional) – A dict mapping input variable names to dtypes, or just a single dtype. By default dtypes will be inferred from variables’ attributes. Note that this parameter takes precedence over variables’ attributes.
  • fallback_dtype (str, optional) – A dtype that will be used for variables whose dtype can’t be inferred from other variables’ dtypes.
Returns:

  • graph (nnvm.graph.Graph) – The resulting graph with dtype and shape information on its nodes.
  • input_shapes (Dict[str, Tuple[int]]) – The inferred shapes of input variables merged with the shape dictionary.
  • input_dtypes (Dict[str, str]) – The inferred dtypes of input variables merged with the dtype dictionary.
  • output_shapes (List[Tuple[int]]) – The inferred shapes of outputs.
  • output_dtypes (List[str]) – The inferred dtypes of outputs.
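
A usage sketch: run inference on a small graph and inspect the returned information (the variable names here are illustrative).

import nnvm.graph
import nnvm.symbol as sym
from nnvm.testing.check_computation import infer_shapes_dtypes

x = sym.Variable("x")
graph = nnvm.graph.create(sym.log(x))
graph, in_shapes, in_dtypes, out_shapes, out_dtypes = infer_shapes_dtypes(
    graph, shape={'x': (1, 3)}, dtype='float32')
# in_shapes should contain {'x': (1, 3)} and out_dtypes ['float32']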

Testing new operations

When adding new operations, it is a good idea to test them. Testing should be done with the function nnvm.testing.check_function. You should provide it with the symbol representing the result of a computation and a reference numpy implementation. By default, it will also check analytical gradients against numerical gradients if analytical gradients are implemented for your operation. You can also pass a reference implementation for the gradients, in which case numerical gradients will still be checked. Numerical gradient checking may be switched off explicitly, but doing so is generally not a good idea. Here is an example testing the logarithm operation:

import numpy as np
import nnvm
import nnvm.symbol as sym
from nnvm.testing.check_computation import check_function

x = sym.Variable("x")
y = sym.log(x)

def forward(x):
    return np.log(x)

def backward(head_grads, x):
    # d/dx log(x) = 1/x, scaled elementwise by the incoming head gradients
    return [1. / x * head_grads]

dtype = "float32"
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, in_range=(0.001, 2.0), dtype=dtype, shape=shape)

If you run the code above, you might get an AssertionError in rare cases. That’s why it is recommended to run new tests many times:

for _ in range(10000):
    check_function(y, forward, backward, in_range=(0.001, 2.0), dtype=dtype, shape=shape)

If you run the code above, then sooner or later you will get an exception that may look like this:

AssertionError: Analytical and numerical grads wrt x differ too much
analytical grad = [
        ...
    ]
numerical grad = [
        ...
    ]
distance > atol*sqrt(n) + rtol*grad_norm
distance 308.50885009765625 > 0.01*55.42562584220407 + 0.1*2167.70703125

It means that either there is a mistake in the FGradient function or the numerical error is too high. Generally, if the printed gradients differ only slightly or just in a single position, it is a numerical error. But if the gradients look completely different, especially if many corresponding positions have different signs, then something is almost certainly wrong with the analytical gradient implementation.

Then try to make the error reproducible, and also try to reduce the shape of the inputs, but not too much; a vector of 10 elements is a reasonable choice. You also won’t need the reference functions forward and backward, and restricting the set of targets might be a good idea as well. Since the error may manifest itself only in rare cases, you might want to run the check in a loop:

shape = {'x': (10,)}
np.random.seed(42)

for _ in range(1000):
    check_function(y, in_range=(0.001, 2.0), dtype=dtype, shape=shape,
                   numerical_grads=True, only_targets=['llvm'])

Running this code will result in the following:

check_function failed while checking gradients numerically, here is the main graph
Graph(%x, %head_grads_0) {
  %x, shape=[10], dtype=0
  %head_grads_0, shape=[10], dtype=0
  %1 = log(%x), shape=[10], dtype=0
  %3 = elemwise_div(%head_grads_0, %x), shape=[10], dtype=0
  ret %1, %3, %head_grads_0
}
graph_attr_keys = [layout_inputs, dtype_num_unknown_nodes, dtype, shape_num_unknown_nodes, shape]

Generated inputs:
{'x': array([2.5660574e-01, 1.5313280e+00, 1.0232578e-03, 8.3371508e-01,
       1.0454979e+00, 1.1021420e-01, 1.9461832e+00, 4.5302454e-01,
       6.0909325e-01, 6.0858107e-01], dtype=float32), 'head_grads_0': array([0.4616029 , 0.00394617, 1.4589603 , 1.9337242 , 0.44936267,
       1.3264314 , 1.4840508 , 1.6970023 , 0.84583575, 0.60655886],
      dtype=float32)}

...

AssertionError: Analytical and numerical grads wrt x differ too much
analytical grad = [1.7988799e+00 2.5769596e-03 1.4257993e+03 2.3194065e+00 4.2980734e-01
 1.2035031e+01 7.6254421e-01 3.7459390e+00 1.3886802e+00 9.9667716e-01]
 numerical grad = [1.7948151e+00 1.9073486e-03 9.9268610e+02 2.3174286e+00 4.2915344e-01
 1.1980057e+01 7.6198578e-01 3.7412643e+00 1.3866425e+00 9.9563599e-01]
distance > atol*sqrt(n) + rtol*grad_norm
distance 433.11322021484375 > 0.01*3.1622776601683795 + 0.1*992.7716674804688

In this case the largest difference is in the 2nd position (counting from 0), which corresponds to the input value 1.0232578e-03. This value is too close to the singularity, so the numerical derivative becomes too imprecise. The solution is to shrink the range for x; here, for example, (0.002, 2.0) turned out to be enough. Don’t forget to run lots of tests, so that other people don’t get false positives:

for _ in range(100):
    check_function(y, in_range={x: (0.002, 2.0)}, dtype=dtype, shape=(1, 3, 32, 32),
                   numerical_grads=True, only_targets=['llvm'])
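
To see why values close to the singularity break the numerical check, consider a central finite difference of log; the step size below is a hypothetical illustration, not necessarily the one used by check_numerical_grads:

import numpy as np

xv, delta = 1e-3, 5e-4  # input near the singularity; illustrative step size
numerical = (np.log(xv + delta) - np.log(xv - delta)) / (2 * delta)
print(numerical, 1.0 / xv)  # ~1098.6 vs the true derivative 1000.0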

If you need more precise control over which values get passed to the checking function, you can use values={x: ...}:

x_val = np.array([1.2594858e+00, 1.0960974e-01, 1.4975418e+00, 6.3585603e-01,
       1.2692513e-03, 1.0227472e+00, 9.4656967e-02, 5.5306298e-01,
       1.4142460e+00, 1.2631655e-01], dtype=np.float32)
check_function(y, values={x: x_val}, dtype=dtype, shape=shape,
               numerical_grads=True, only_targets=['llvm'])