Nervana Cloud 1.5.0 contains major under-the-hood changes and improvements. We’ve revamped much of the core underlying code, separated the application components into their own microservices, rewritten our job launcher, added support for a new container orchestration service, squashed more than 75 bugs, and greatly expanded our test coverage. The changes most visible to end users are the aeon dataloader and auxiliary file volume support.

Aeon dataloader – The aeon dataloader enables fast, flexible access to training datasets that are too large to load directly into memory. Data is first loaded in chunks called “macrobatches,” which are then split into minibatches to feed the model. A simple interface lets you configure the dataloader for custom datasets and load data from disk with minimal latency. A manifest file indicates your local input and target paths. On the backend, we’ve built an entirely new data service to handle fetching, caching, and serving requests for dataset batches. See the Aeon User Guide for more information.
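As an illustrative sketch only (the exact manifest layout and configuration keys depend on your aeon version; consult the Aeon User Guide), the workflow above amounts to writing a manifest of input/target path pairs and pointing a dataloader configuration at it:

```python
import csv
import os
import tempfile

# Illustrative sketch: build an aeon-style manifest mapping each input
# file to its target (label) file. The two-column CSV layout here is an
# assumption; your aeon version's format may differ.
workdir = tempfile.mkdtemp()
records = []
for i in range(4):
    img = os.path.join(workdir, "image_%d.jpg" % i)
    lbl = os.path.join(workdir, "label_%d.txt" % i)
    open(img, "wb").close()           # placeholder input file
    with open(lbl, "w") as f:
        f.write(str(i % 2))           # placeholder target file
    records.append((img, lbl))

manifest_path = os.path.join(workdir, "manifest.csv")
with open(manifest_path, "w", newline="") as f:
    csv.writer(f).writerows(records)

# Hypothetical dataloader config: macrobatches are fetched in chunks
# and then split into minibatches of the requested size.
config = {
    "manifest_filename": manifest_path,
    "macrobatch_size": 4,   # records loaded from disk per chunk
    "minibatch_size": 2,    # records fed to the model per step
}
print(sum(1 for _ in open(manifest_path)))  # 4 manifest records
```

The key point is the split in responsibilities: the manifest names the data on disk, while the config controls how it is chunked and served to the model.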

Auxiliary file volume support – We’ve also introduced support for arbitrary file volumes to handle non-aeon-formatted data such as older datasets, vocabulary files, and output data. The volumes are mounted read/write during model training and inference jobs. You can append new files to existing volumes or download their contents. See Attaching Data for details on how to use this feature.

This release of Nervana Cloud includes a number of additional features and fixes:

  • New ncloud commands and API endpoints for retrieving command history, getting machine information, and revoking access tokens.
  • Automatic command retry and other enhancements to improve stability when uploading large individual files and directories of many small files. In addition, batch sizes are now configurable to better cope with network lag and disruptions. We also cap the number of simultaneous open file descriptors.
  • Enhancements such as automatic scaling and load balancing to improve streaming inference performance.
  • Revamped administration of users, groups, and tenants to improve the content displayed and fix removal operations in certain scenarios. Administration is now also supported through the web user interface.
  • Nervana Cloud now defaults to neon v1.9.0: all training jobs, interactive Jupyter sessions, and model deployment jobs will assume neon v1.9.0 unless you explicitly override them to use a different version.
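The upload-stability techniques listed above (automatic retries, capped open file descriptors) are standard patterns. The sketch below is not the ncloud client code, just a generic illustration: exponential-backoff retries around a flaky send, with a semaphore limiting how many file descriptors may be open at once.

```python
import threading
import time

# Cap on simultaneously open file descriptors (illustrative value).
MAX_OPEN_FDS = 8
fd_gate = threading.Semaphore(MAX_OPEN_FDS)

def upload_with_retry(send, payload, retries=3, base_delay=0.01):
    """Attempt send(payload), retrying transient failures with
    exponential backoff, while holding a file-descriptor slot."""
    with fd_gate:  # hold a slot while the "file" is open
        for attempt in range(retries + 1):
            try:
                return send(payload)
            except IOError:
                if attempt == retries:
                    raise  # exhausted retries; surface the error
                time.sleep(base_delay * (2 ** attempt))

# Demo: a sender that fails twice with a transient error, then succeeds.
attempts = {"n": 0}
def flaky_send(payload):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise IOError("transient network error")
    return "uploaded %s" % payload

print(upload_with_retry(flaky_send, "chunk-0"))  # uploaded chunk-0
```

Batching (also mentioned above) composes naturally with this pattern: each batch is simply a payload passed through `upload_with_retry`, so a mid-transfer disruption costs only the current batch rather than the whole upload.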
