Released Neural Network Libraries v1.11.0!
Spotlight
CUDA extension: If allReduce() (e.g. MPI_Allreduce()) gets no response, the system raises an exception.
In multi-node training, developers are often unsure whether training has actually hung or whether some computation is simply taking a long time. We can distinguish the two by checking whether allReduce() returns within a certain time. Normally, barrier() is used to set a synchronization point. Before this synchronization point, different nodes are allowed to run different computation tasks whose processing times may vary. After this synchronization point, however, the nodes are assumed to perform similar computations, so allReduce() should finish within a certain period of time. If it does not, some node has probably stopped responding. With this feature, training is stopped as soon as an exception occurs on any node.
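The sketch below illustrates this watchdog pattern with mpi4py rather than NNabla's internal communicator API; the function name, timeout value, and polling loop are assumptions for illustration only.

```python
# A minimal sketch of the hang-detection idea, assuming mpi4py.
# This is not NNabla's actual implementation.
import time
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

def allreduce_with_timeout(grad, timeout=300.0):  # timeout is an assumption
    """All-reduce `grad` across nodes; raise if it does not finish in time."""
    # Synchronization point: work before this line may take varying time on
    # each node, so the clock only starts once every node has arrived here.
    comm.Barrier()
    result = np.empty_like(grad)
    req = comm.Iallreduce(grad, result, op=MPI.SUM)  # non-blocking all-reduce
    deadline = time.monotonic() + timeout
    while not req.Test():  # poll for completion
        if time.monotonic() > deadline:
            # Some node is probably unresponsive; raise instead of hanging
            # forever. In practice the whole job must then be torn down.
            raise RuntimeError(
                "allreduce did not return within %.0f s" % timeout)
        time.sleep(0.01)
    return result
```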
C-runtime: Support fixed-point operators and avoid frequent conversion between float and fixed-point
Fixed-point versions have been added for the following functions (a conceptual sketch of fixed-point arithmetic follows the list):
– DepthwiseConvolution
– AddScalar, MulScalar
– Reshape
– Transpose
– Affine
– ReLU
– Add2, Sub2
– Mul2, Div2
– SumPooling
– MaxPooling
– AveragePooling
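To illustrate why native fixed-point operators help, the NumPy sketch below keeps a chain of operations entirely in an integer Q-format, so float conversions happen only at the graph's boundaries. The Q7.8 format, names, and helpers are illustrative assumptions, not the C-runtime's actual API.

```python
# A conceptual sketch of Q7.8 fixed-point arithmetic (assumed format);
# this does not reflect the C-runtime's implementation.
import numpy as np

FRAC_BITS = 8  # Q7.8: int16 values with 8 fractional bits

def to_fixed(x):
    """Quantize a float array into the Q7.8 representation."""
    return np.round(np.asarray(x, dtype=np.float32)
                    * (1 << FRAC_BITS)).astype(np.int16)

def to_float(q):
    """Recover floats from Q7.8 values (only needed at the boundaries)."""
    return q.astype(np.float32) / (1 << FRAC_BITS)

def relu_fixed(q):
    # ReLU operates directly on the integer representation; no float
    # conversion is required in the middle of the graph.
    return np.maximum(q, 0)

def mul2_fixed(a, b):
    # Multiply in a wider type, then shift back to keep the Q-format,
    # saturating to the int16 range.
    prod = a.astype(np.int32) * b.astype(np.int32)
    return np.clip(prod >> FRAC_BITS,
                   -(1 << 15), (1 << 15) - 1).astype(np.int16)
```

Chaining relu_fixed() and mul2_fixed() keeps every intermediate result in int16, which is the conversion-avoiding behavior the fixed-point function versions provide.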
Layers
Utilities
- Detect slow GPUs in multi-node training.
- Use multiprocessing to create caches; add a test file for create_cache.
- Fix normalization with 16-bit integer data.
- Enable .h5 parameter saving.
- Measure CPU and GPU load with separate configurations.
- Redesign the current graph converter
Format Conversion
Examples
Documentation
- Improve the NNabla converter documentation.
- Specify the converter-related package versions in the documentation.
- Update the CUDA/cuDNN compatibility table and the OpenMPI installation tutorial.
Bugfix
- Temporarily skip the new_group test.
- Fix a code typo.
- Check shape length before checking first element.
- Fix the gradient operation when the output variable is in-placed.
- Fix a DeepLab inference bug.
- Fix an issue where multiple processes access the same file in multi-node training.
- Fix the gradient operation of ScatterNd when the output variable is in-placed.