We are excited to share neon’s v1.2 release with the community. It brings several major features (Kepler support, plus new macrobatch and serialization enhancements) and new examples, along with an expanded Model Zoo to help users get started with their use cases.
Benchmark numbers for neon vs. Caffe (using cuDNN v3), and for Nervana Cloud (Titan X) vs. AWS (Grid K520), are below (smaller numbers are better). neon on Maxwell GPUs and our Cloud remain the recommended ways of using neon for the best performance (typically 10x faster than on AWS). Even though we did not prioritize optimizing for AWS, we surpassed cuDNN v3 performance for fprop (inference) for AlexNet on AWS. Note also that these networks typically run for several days or weeks, whereas the times shown are for a single iteration, so even small per-iteration differences can add up to hours or days saved by using the Nervana Platform instead of AWS. Combined with our multi-GPU implementation, we can achieve a ~70x speedup over AWS g2.2xlarge performance.
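To make the per-iteration point concrete, here is a minimal sketch of the arithmetic. The timings and iteration count are hypothetical placeholders chosen for illustration, not measured benchmark results, and `hours_saved` is a helper written for this example rather than part of neon.

```python
def hours_saved(ms_per_iter_slow, ms_per_iter_fast, iterations):
    """Wall-clock hours saved over a full training run, given the
    per-iteration time (in milliseconds) on two platforms."""
    saved_ms = (ms_per_iter_slow - ms_per_iter_fast) * iterations
    return saved_ms / 1000.0 / 3600.0


# Hypothetical values for illustration only (not benchmark data).
slow_ms = 300.0        # assumed per-iteration time on the slower platform
fast_ms = 250.0        # assumed per-iteration time on the faster platform
iterations = 450_000   # assumed number of iterations in a long training run

print(f"{hours_saved(slow_ms, fast_ms, iterations):.2f} hours saved")
# prints "6.25 hours saved"
```

Under these assumed numbers, a 50 ms gap per iteration already amounts to more than six hours over the full run, and the saving grows linearly with the number of iterations.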
GoogLeNet and VGG are too large to fit on AWS GPUs, so the numbers below are for Nervana Cloud (Titan X) only.
We continue to top the speed benchmarks and are continuously working to improve ease of use. Our next major milestone is to expand our automatic differentiation feature beyond individual layers so that it works with full networks, which should make exploratory investigations even easier. We look forward to seeing the creative ways in which the deep learning community will use neon. Drop us a note at email@example.com with any feedback (both positive and negative!).