Released Neural Network Libraries v1.24.0!

Friday, December 24, 2021


Posted by shin

We have released Neural Network Libraries v1.24.0! Improved DDPM and NeRF models have been added, as well as speedups for several CUDA implementations. Also, we now support Python 3.9.


Improved DDPM

We have implemented an nnabla version of Improved Denoising Diffusion Probabilistic Models (ICML 2021)! Denoising Diffusion Implicit Models (ICLR 2021) is also included as a sampling method for generation.

Diffusion models are a family of generative models that have recently gained attention because they can generate high-quality images comparable to, if not better than, those generated by GANs, while allowing for stable training. For further details, please refer to the respective papers or our implementation in the nnabla-examples repository.
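For readers new to diffusion models, the forward (noising) process can be sketched in a few lines. The snippet below is a plain-Python illustration and not code from nnabla-examples. It assumes the cosine noise schedule described in the Improved DDPM paper, and the function names (`cosine_alpha_bar`, `cosine_betas`, `q_sample`) are hypothetical.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level alpha_bar(t) under the (assumed) cosine schedule."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def cosine_betas(num_steps, max_beta=0.999):
    """Per-step noise variances beta_t derived from alpha_bar, clipped for stability."""
    return [
        min(1 - cosine_alpha_bar(t + 1, num_steps) / cosine_alpha_bar(t, num_steps),
            max_beta)
        for t in range(num_steps)
    ]


def q_sample(x0, t, num_steps, rng=random):
    """Draw x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I) for a flat list x0."""
    a_bar = cosine_alpha_bar(t, num_steps)
    return [
        math.sqrt(a_bar) * v + math.sqrt(1 - a_bar) * rng.gauss(0.0, 1.0)
        for v in x0
    ]
```

At `t = 0` the sample equals the clean input, and as `t` approaches `num_steps` it approaches pure Gaussian noise; the model is trained to reverse this corruption step by step.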



NeRF

We have implemented Neural Radiance Fields (NeRF), a deep learning model for novel view synthesis based on volumetric rendering! Two variants of NeRF have been implemented:

  • Original NeRF: This has been verified across 20 scenes from the LLFF (realistic, forward-facing), DeepVoxels, and Blender (synthetic) datasets, and its test performance has been benchmarked against the original implementation.
  • NeRF in the Wild (NeRF-W): This variant allows novel view synthesis of a scene from an unconstrained set of photos of that scene. It has been verified on a synthetic Lego scene with artificially added transient occluders and appearance variation, and on multiple scenes from the Phototourism dataset (Sacre Coeur, Brandenburg Gate, Taj Mahal, Hagia Sophia, Notre-Dame).
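The volumetric rendering step that both variants build on can be sketched as follows. This is a minimal plain-Python illustration of the standard alpha-compositing rule along a single ray, not the nnabla implementation; the function name `composite_ray` and its arguments are hypothetical.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: volume density sigma_i at each sample
    colors:    (r, g, b) tuple at each sample
    deltas:    length of the ray segment covered by each sample
    """
    transmittance = 1.0  # fraction of light that still reaches the current sample
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha  # light remaining after this segment
    return pixel, weights
```

In NeRF, the densities and colors come from a network evaluated at points sampled along each camera ray, and the composited pixel is compared with the ground-truth photo.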

Optimization of Instance Normalization (CPU / GPU)

We have implemented a kernel that broadcasts scale and bias at the same time, making the CUDA implementation of Instance Normalization faster. Both forward and backward propagation are now significantly faster than in the previous nnabla implementation in all cases except when the batch size is 1. (The speedup depends on the input shape, ranging from several tens to 100 times faster.)
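As a reminder of what the operation computes, here is a minimal plain-Python reference. It is only a sketch of the semantics, not the optimized kernel; the function name `instance_norm` and the (N, C, L) nested-list layout are assumptions made for illustration.

```python
import math


def instance_norm(x, gamma, beta, eps=1e-5):
    """Reference instance normalization for input shaped (N, C, L) as nested lists.

    Statistics are computed per sample and per channel over the spatial axis;
    gamma (scale) and beta (bias) hold one value per channel.
    """
    out = []
    for sample in x:
        normalized_sample = []
        for c, channel in enumerate(sample):
            mean = sum(channel) / len(channel)
            var = sum((v - mean) ** 2 for v in channel) / len(channel)
            inv_std = 1.0 / math.sqrt(var + eps)
            # Scale and bias are applied in the same pass as the normalization.
            normalized_sample.append(
                [gamma[c] * (v - mean) * inv_std + beta[c] for v in channel]
            )
        out.append(normalized_sample)
    return out
```

Because each (sample, channel) pair is normalized independently while scale and bias are shared per channel, both parameters have to be broadcast across the batch and spatial axes.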

Optimization of CumProd / CumSum (CPU / GPU)

We have reduced the memory usage and improved the speed of the CUDA implementations of CumProd and CumSum. For arbitrary input shapes, both forward and backward propagation are now significantly faster than in the previous nnabla implementation. (The speedup depends on the input shape, ranging from several tens to a thousand times faster.)
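For reference, the forward and backward semantics of the two operations on a 1-D input can be sketched as below. This is a plain-Python illustration and says nothing about how the CUDA kernels are written; all function names are hypothetical.

```python
from itertools import accumulate
import operator


def cumsum(xs):
    """y_j = x_0 + ... + x_j"""
    return list(accumulate(xs))


def cumprod(xs):
    """y_j = x_0 * ... * x_j"""
    return list(accumulate(xs, operator.mul))


def cumsum_backward(grad_out):
    """dL/dx_i is the sum of grad_out[j] for j >= i, i.e. a reversed cumsum."""
    return list(accumulate(reversed(grad_out)))[::-1]


def cumprod_backward(xs, grad_out):
    """dL/dx_i = sum over j >= i of grad_out[j] * prod(x_k for k <= j, k != i).

    Written directly from the definition: O(n^2), but valid when xs contains zeros.
    """
    n = len(xs)
    grad_in = [0.0] * n
    for i in range(n):
        partial = 1.0
        for k in range(i):  # product of x_0 .. x_{i-1}
            partial *= xs[k]
        for j in range(i, n):
            if j > i:
                partial *= xs[j]  # extend the product up to x_j, skipping x_i
            grad_in[i] += grad_out[j] * partial
    return grad_in
```

For example, with `xs = [2, 3, 4]` and an all-ones upstream gradient, `cumprod(xs)` gives `[2, 6, 24]` and `cumprod_backward` gives `[16.0, 10.0, 6.0]`.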

Enhanced recomputation API

We have enhanced our recomputation API, which discards the results of forward computation and recomputes them during backward computation to reduce memory usage during training. You can easily set a range for recomputation using Python’s with statement, as shown below:

import nnabla as nn

x = nn.Variable(...)  # "..." stands for the input shape

with nn.recompute():
    h = net1(x)  # all intermediate variables will be set as recompute=True

y = net2(h)  # net1 and net2 are placeholders for your own network functions

y.forward()   # variables in net1 will be cleared from memory
y.backward()  # variables in net1 will be recomputed when required
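The trade-off behind recomputation can be illustrated with a small toy that does not depend on nnabla. The sketch below is only a conceptual model of the idea (cache nothing in the forward pass, rebuild layer inputs when the backward pass needs them) and does not reflect how nnabla implements it internally; every class and function name here is hypothetical.

```python
import math


class Scale:
    """y = w * x"""
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        return self.w * x

    def backward(self, x, grad_out):
        return self.w * grad_out


class Tanh:
    """y = tanh(x); the backward pass needs the layer input x."""
    def forward(self, x):
        return math.tanh(x)

    def backward(self, x, grad_out):
        return grad_out * (1.0 - math.tanh(x) ** 2)


def run_forward(layers, x, keep_inputs):
    """Run the chain, optionally caching each layer's input for backward."""
    inputs = []
    for layer in layers:
        if keep_inputs:
            inputs.append(x)
        x = layer.forward(x)
    return x, inputs


def backprop(layers, inputs):
    """Propagate d(output)/d(output) = 1 back to the chain input."""
    grad = 1.0
    for layer, layer_input in zip(reversed(layers), reversed(inputs)):
        grad = layer.backward(layer_input, grad)
    return grad


def grad_standard(layers, x):
    # Every layer input stays in memory between forward and backward.
    _, inputs = run_forward(layers, x, keep_inputs=True)
    return backprop(layers, inputs)


def grad_recompute(layers, x):
    # Forward pass caches nothing; only the chain input x is kept.
    run_forward(layers, x, keep_inputs=False)
    # At backward time the layer inputs are rebuilt from x.
    _, inputs = run_forward(layers, x, keep_inputs=True)
    return backprop(layers, inputs)
```

Both functions return the same gradient; the recomputing version pays one extra forward pass in exchange for not holding intermediate results between the forward and backward passes.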

StyleGAN2 Training

Following the release of StyleGAN2-CDC and StyleGAN2-EWC in v1.23, we now present a StyleGAN2 training implementation in nnabla. The implementation has been verified on the FFHQ dataset and supports additional inference operations such as latent space interpolation, latent space projection, and perceptual path length calculation.

Support for Python 3.9 (CPU / GPU)

We have added support for Python 3.9. Along with this update, the TensorFlow version used by the file format converter has been updated to v2.5.1.


Format Converter