Training deep learning networks means moving a lot of data and using best-in-class memory technologies. The Intel Nervana NNP uses high-capacity, high-speed High Bandwidth Memory to provide the maximum level of on-chip storage and fast memory access, and it separates the pipelines for computation and data management so that new data is available sooner.
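The effect of separating the data-management pipeline from the compute pipeline can be illustrated with a simple double-buffering sketch. This is not the NNP's actual mechanism, just a minimal software analogue: while one batch is being processed, a separate worker fetches the next one, so compute rarely waits on data. The `fetch` and `compute` functions here are hypothetical stand-ins.

```python
import concurrent.futures as cf
import time

def fetch(start):
    """Stand-in for a memory/IO transfer that delivers a batch of data."""
    time.sleep(0.01)
    return list(range(start, start + 4))

def compute(batch):
    """Stand-in for the compute pipeline's work on one batch."""
    return sum(batch)

results = []
with cf.ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(fetch, 0)          # prefetch the first batch
    for i in range(1, 4):
        batch = future.result()             # ready unless the fetch is slower than compute
        future = pool.submit(fetch, i * 4)  # overlap the next fetch...
        results.append(compute(batch))      # ...with this batch's compute
    results.append(compute(future.result()))
```

With the fetch of batch N+1 issued before batch N is processed, transfer latency is hidden behind computation rather than added to it.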
To achieve higher throughput for neural network workloads, we have invented Flexpoint, a new numerical data format for the Intel Nervana NNP that delivers higher speed and higher compute density than conventional numerical formats. Flexpoint enables a vast increase in parallelism on a die while simultaneously decreasing power per computation.
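A minimal sketch can convey the idea behind shared-exponent formats of this kind: every element of a tensor shares a single exponent, and each element stores only a compact integer mantissa, so the per-element arithmetic is fixed-point. This is an illustrative block floating-point scheme in the spirit of Flexpoint, not Intel's actual encoding; the function names and the 16-bit mantissa width are assumptions.

```python
import numpy as np

def flex_encode(x, mantissa_bits=16):
    """Encode a float array as integer mantissas plus one shared exponent."""
    max_mag = np.max(np.abs(x))
    if max_mag == 0:
        return np.zeros(x.shape, dtype=np.int32), 0
    # Pick the exponent so the largest magnitude fits in the mantissa range.
    exp = int(np.ceil(np.log2(max_mag / (2 ** (mantissa_bits - 1) - 1))))
    mantissas = np.round(x / 2.0 ** exp).astype(np.int32)
    return mantissas, exp

def flex_decode(mantissas, exp):
    """Recover approximate float values from mantissas and the shared exponent."""
    return mantissas.astype(np.float64) * 2.0 ** exp
```

Because the exponent is stored once per tensor rather than once per element, multiplies and adds can run on narrow integer units, which is where the density and power advantages over per-element floating point come from.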
Designed with high-speed on- and off-chip interconnects, the Intel Nervana NNP enables massive bidirectional data transfer distributed across multiple chips. This makes multiple chips act as one large virtual chip that can accommodate larger models, allowing customers to capture more insight from their data.
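How multiple chips behave as one larger virtual chip can be sketched with a simple model-parallel matrix multiply. In this illustrative example (not Intel's API), a weight matrix too large for one device is split column-wise across two "chips"; each computes its slice, and the partial results are exchanged over the interconnect and concatenated.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))        # activations, broadcast to both chips
w = rng.standard_normal((8, 6))        # full weight matrix of the layer

w0, w1 = np.hsplit(w, 2)               # each "chip" holds half the columns
y0 = x @ w0                            # computed on chip 0
y1 = x @ w1                            # computed on chip 1
y = np.concatenate([y0, y1], axis=1)   # gathered over the interconnect
```

The concatenated result matches `x @ w` exactly, so the pair of devices is functionally one chip holding the full matrix; fast bidirectional links keep the gather step from becoming the bottleneck.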
Nervana is currently developing the Nervana Engine, an application-specific integrated circuit (ASIC) that is custom-designed and optimized for deep learning. Training a deep neural network involves many compute-intensive operations, including matrix multiplication of tensors and convolution. Graphics processing units (GPUs) are better suited to these operations than CPUs, since GPUs were originally designed for video…
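To see why convolution dominates training time, consider a minimal direct 2D convolution (valid padding, stride 1); the function name and shapes here are illustrative, not from any particular framework.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive valid-mode 2D convolution (no kernel flip, i.e. cross-correlation)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output pixel is an inner product over a kernel-sized window,
            # so the total work scales with output size times kernel size.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16.0).reshape(4, 4)
k = np.ones((2, 2))
res = conv2d(img, k)
```

Every output element is an independent multiply-accumulate reduction, which is exactly the regular, massively parallel arithmetic that GPUs and purpose-built accelerators exploit.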