Intel plans AI inference chip with Facebook

Intel is working with Facebook to build a server processor dedicated to speeding up AI performance.

The chip giant, whose silicon dominates the planet’s server rooms, has announced the Nervana Neural Network Processor for Inference (NNP-I), which it said would be completed by year’s end.

Intel reckons NNP-I will accelerate inference for companies with high workload demands.

Additionally, Intel announced it’s working on a processor for neural network training – Neural Network Processor for Training, codenamed Spring Crest – due later this year.

Inference is the stage of machine learning in which a system that has already been trained applies what it learned to deduce an answer from newly available data, rather than simply giving a “yes” or “no” answer to a question.
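To make the split between training and inference concrete, here is a minimal sketch in Python. It is a toy one-variable linear model with made-up data, not anything Intel, Facebook or the other vendors have described; the function names `train` and `infer` are purely illustrative.

```python
# Toy illustration only: hypothetical data and function names,
# unrelated to any vendor's actual hardware or software.

def train(xs, ys):
    """Training: fit y = w*x + b to labelled examples by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def infer(model, x):
    """Inference: apply the already-learned parameters to unseen input."""
    w, b = model
    return w * x + b


# The expensive step happens once, on historical data (here y = 2x + 1).
model = train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# The cheap step is repeated for every new request.
prediction = infer(model, 10.0)
print(prediction)
```

Training runs once over known examples; inference reuses the stored parameters for each new input, which is the workload a dedicated inference chip is meant to speed up.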

Inference is used in search engines, streaming video services like Netflix and by AI assistants such as Alexa.

Intel did not provide technical details for NNP-I but the chief hurdle it’ll need to overcome will be efficient performance.

That means high throughput and low latency for both applications and vast amounts of data, from a processor that consumes little power and generates relatively little heat.

Throughput and latency matter not only for application and data performance but also because the application must retain what it has learned and apply that knowledge to the new data.
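For readers who want to see how those two figures are usually measured, the following is a rough sketch in Python. The workload `fake_inference` is a hypothetical stand-in for a real model, and the batch size and run count are arbitrary assumptions; the numbers it prints say nothing about any of the chips mentioned here.

```python
import time


def fake_inference(batch):
    """Hypothetical stand-in for a model: one score per input item."""
    return [sum(v * v for v in item) for item in batch]


def benchmark(fn, batch, runs=50):
    """Return (mean seconds per batch, items processed per second)."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(batch)
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    avg_latency = total / runs
    # Guard against a zero total on very coarse timers.
    throughput = (len(batch) * runs) / total if total > 0 else float("inf")
    return avg_latency, throughput


# Assumed workload: 32 items per batch, 64 features per item.
batch = [[float(i + j) for j in range(64)] for i in range(32)]
avg_latency, throughput = benchmark(fake_inference, batch)
print(f"mean latency per batch: {avg_latency * 1e3:.3f} ms")
print(f"throughput: {throughput:.0f} items/s")
```

Latency is how long one request waits; throughput is how many items are handled per second. Larger batches typically raise throughput at the expense of latency, which is the trade-off inference hardware has to balance.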

Intel follows Nvidia and Amazon into this field: Nvidia announced the Tesla T4 graphics card in September 2018, while AWS came out with Inferentia in November.

Nvidia claims the T4 GPU is 40 times faster than the kind of CPU Intel is traditionally known for, while AWS claims its chip, used with its EC2 and SageMaker instances, will allow developers to reduce their inference costs by up to 75 per cent. Inferentia works with TensorFlow, Apache MXNet, PyTorch and ONNX, while Google is using T4 GPUs through its cloud.

It’s unclear at this point whether Intel’s effort will reach the open market or simply be available to you through Facebook.

Facebook has been evaluating systems for inference, with the ultimate goal of picking one or two architectures for its server lineup. Intel is one of the companies Facebook has worked with.