www.ptreview.co.uk
CEVA News, August 2023
CEVA DOUBLES DOWN ON GENERATIVE AI WITH ENHANCED NEUPRO-M NPU IP FAMILY
NeuPro-M delivers an industry-leading 350 TOPS/Watt to bring the power of Generative AI to infrastructure, industrial, automotive, PC, consumer, and mobile markets with exceptional cost and power efficiency.
CEVA, Inc. announced its enhanced NeuPro-M NPU family, directly addressing the processing needs of the next era of Generative AI with industry-leading performance and power efficiency for any AI inferencing workload, from cloud to edge. The NeuPro-M NPU architecture and tools have been extensively redesigned to support transformer networks in addition to CNNs and other neural networks, as well as future machine learning inferencing models. This enables highly optimized applications leveraging the capabilities of both Generative and classic AI to be developed and run seamlessly on the NeuPro-M NPU inside communication gateways, optically connected networks, cars, notebooks and tablets, AR/VR headsets, smartphones, and any other cloud or edge use case.
As inferencing and modeling techniques evolve, new capabilities for running smaller, domain-specific LLMs, vision transformers, and other generative AI models at the device level are set to transform applications across the infrastructure, industrial, automotive, PC, consumer, and mobile markets. Crucially, the enhanced NeuPro-M architecture is highly versatile and future-proof thanks to an integrated VPU (Vector Processing Unit) that can support any future network layer. The architecture also supports any activation function and any data flow, with true sparsity for both data and weights that enables up to 4X acceleration in performance, allowing customers to address multiple applications and markets with a single NPU family.
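The sparsity claim can be understood with a small sketch. The idea behind zero-skipping hardware is that any multiply-accumulate (MAC) where either the weight or the activation is zero contributes nothing and can be skipped, so effective throughput scales with the density of the operands. The code below is purely illustrative arithmetic, not CEVA's implementation; the matrix sizes and sparsity levels are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_matvec(W, x):
    """Compute W @ x while skipping zero weights and zero inputs,
    counting only the multiply-accumulates actually performed."""
    macs = 0
    y = np.zeros(W.shape[0])
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            if W[i, j] != 0.0 and x[j] != 0.0:
                y[i] += W[i, j] * x[j]
                macs += 1
    return y, macs

# Hypothetical operands: ~75%-sparse weights, ~50%-sparse activations.
W = rng.standard_normal((64, 64)) * (rng.random((64, 64)) > 0.75)
x = rng.standard_normal(64) * (rng.random(64) > 0.50)

y_sparse, macs = sparse_matvec(W, x)
dense_macs = W.size  # a dense engine performs every MAC regardless
print(f"dense MACs: {dense_macs}, sparse MACs: {macs}, "
      f"effective speedup ~{dense_macs / macs:.1f}x")
assert np.allclose(y_sparse, W @ x)  # skipping zeros changes no results
```

With 75% weight sparsity alone, skipping zeros removes roughly three of every four MACs, which is where an "up to 4X" figure for weight sparsity comes from; exploiting activation sparsity as well pushes the effective ratio higher for these particular operands.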
To provide the greater scalability required by diverse AI markets, NeuPro-M adds new NPM12 and NPM14 NPU cores, with two and four NeuPro-M engines respectively, making it easy to migrate to higher-performance AI workloads; the enhanced NeuPro-M family now comprises four NPUs: the NPM11, NPM12, NPM14, and NPM18. This versatility, along with exceptional performance and power efficiency, makes NeuPro-M the leading NPU IP available in the industry today, with a peak efficiency of 350 TOPS/Watt at a 3nm process node and the ability to process more than 1.5 million tokens per second per watt for transformer-based LLM inferencing.
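The quoted tokens-per-second-per-watt figure translates directly into an energy cost per token, which is the back-of-envelope arithmetic below (illustrative only, not a CEVA benchmark):

```python
# Energy per token implied by the announced figure of
# 1.5 million tokens per second per watt.
tokens_per_second_per_watt = 1.5e6  # from the announcement

# 1 watt sustained for 1 second is 1 joule, so:
joules_per_token = 1.0 / tokens_per_second_per_watt
print(f"~{joules_per_token * 1e6:.2f} microjoules per token")
# i.e. roughly 0.67 µJ per token; a 1 W budget yields 1.5M tokens/s.
```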
Accompanying the enhanced NeuPro-M architecture is a revamped, comprehensive development toolchain based on CEVA's award-winning neural network AI compiler, CDNN, which is architecture-aware to fully utilize the NeuPro-M parallel processing engines and maximize customers' AI application performance. The CDNN software includes a memory manager for reducing memory bandwidth, optimal load-balancing algorithms, and compatibility with common open-source frameworks, including TVM and ONNX.
For more information, visit https://www.ceva-dsp.com/product/ceva-neupro-m/.