What if you could run complex deep learning models at scale without specialized AI hardware? Award-winning MIT computer science professor Nir Shavit will demonstrate how to make inference scale in the real world using only software on commodity CPUs. The Neural Magic Inference Engine bucks conventional wisdom about high-throughput computing and may change your mind about machine learning forever.
For nearly a decade, data scientists have assumed that deep learning models need the parallel computing power of a GPU to deliver "good enough" results. GPUs, however, have limited cache memory, which forces data scientists to sacrifice model accuracy for performance in order to achieve inference at scale.
CPUs, by contrast, have much larger and more sophisticated cache hierarchies that are well suited to these constraints. This session will show how to run deep neural networks at scale with GPU-class performance on commodity CPUs, with all of the deployment flexibility of a pure software solution. Data science teams can run models at exceptional speeds, without the expense and complexity of dedicated hardware.
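To make the claim concrete, here is a minimal NumPy sketch of one ingredient behind CPU-friendly inference: magnitude pruning, which zeroes most of a layer's weights so a sparse kernel does proportionally less multiply-accumulate work and touches a smaller, more cache-friendly working set. This is an illustrative assumption on our part, not Neural Magic's actual engine or API.

```python
import numpy as np

# Hypothetical illustration (not the Neural Magic Inference Engine itself):
# prune a 512x512 layer to ~90% sparsity and compare the arithmetic required.

rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512)).astype(np.float32)

# Keep only the largest ~10% of weights by magnitude; zero the rest.
threshold = np.quantile(np.abs(w), 0.90)
pruned = np.where(np.abs(w) >= threshold, w, 0.0).astype(np.float32)

x = rng.standard_normal(512).astype(np.float32)

dense_macs = w.size                           # 262,144 multiply-accumulates
sparse_macs = int(np.count_nonzero(pruned))   # roughly a tenth of that

print(dense_macs, sparse_macs)

# The pruned layer still produces an output; in practice the accuracy cost of
# pruning is recovered by retraining, which this sketch omits.
y = pruned @ x
```

A sparse execution engine only performs the nonzero multiplies, so the ~10x reduction in work translates into throughput on hardware, like a CPU, whose caches can hold the compressed weights.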
Abstract & Bio
The Software GPU: Making Inference Scale in the Real World