Enabling efficient AI workloads in the vehicle

How Arm optimises in-vehicle AI processing

Robert Day on stage in Detroit.

As automotive OEMs seek to embed advanced AI-driven functionalities in vehicles, they encounter a familiar challenge: how can these workloads be processed efficiently using the constrained compute resources already embedded in cars? At the Automotive Computing Conference 2025 in Detroit, Robert Day, Director of Automotive GTM at Arm Inc., provided practical insights into this question.

From chatbots to diagnostics and predictive maintenance, Day showcased how AI applications can be optimised to run on standard Arm CPUs—without always requiring dedicated NPUs or GPUs. This is not just an exercise in frugality, but a crucial step towards scalable and sustainable AI in everyday vehicles.

AI in the car is no longer limited to computer vision and driver assistance. As Day emphasised, modern vehicles are becoming increasingly personalised, interactive, and responsive. New use cases are emerging rapidly: voice-controlled assistants, LLM-based chatbots, real-time diagnostics, and in-cabin personalisation.

However, these applications demand fast inference and low latency—something not always easy to achieve on embedded automotive hardware. Most in-vehicle CPUs are not over-provisioned. In fact, Day pointed out, “you’re typically constrained by the hardware you've got.” The solution? Make the most of what’s already there!

From the Cloud to the Car

In one example, Day described a chatbot originally developed by AWS to answer user questions about vehicle specs. It was first trained and run in the cloud, offering sub-two-second response times. However, when ported to an Arm-based in-vehicle system, latency ballooned to 20 seconds—far from acceptable.

Arm responded by integrating its KleidiAI optimisation libraries into popular ML frameworks such as llama.cpp. After optimisation, the chatbot's latency dropped to 1–3 seconds, nearly matching its cloud performance. The result: natural language interaction in the car, powered by CPUs and invisible to the developer thanks to seamless framework integration.
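To make this concrete, here is a minimal sketch of what such a CPU-only chatbot can look like, using the llama-cpp-python bindings for llama.cpp. The model file, thread count, and prompts are illustrative assumptions, not details from Day's demo; on an Arm build, any optimised kernels are picked up when llama.cpp is compiled, with no change to application code.

```python
# Minimal on-device chatbot sketch using llama-cpp-python
# (pip install llama-cpp-python). Arm-optimised kernels, where the
# build includes them, are used internally; the app code is portable.
from llama_cpp import Llama

llm = Llama(
    model_path="vehicle-assistant-q4.gguf",  # hypothetical 4-bit quantised model
    n_ctx=2048,    # context window for the conversation
    n_threads=4,   # match the CPU cores budgeted for AI in the ECU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You answer questions about this vehicle."},
        {"role": "user", "content": "What is the recommended tyre pressure?"},
    ],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```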

The key to Arm’s strategy is framework-level AI acceleration. Rather than requiring developers to hand-optimise every workload, Arm integrates optimisation libraries directly into common machine learning frameworks. These include TensorFlow and PyTorch, enabling real-time AI even on cost-sensitive hardware platforms.
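The practical consequence is that the developer-facing code stays unchanged. A PyTorch inference loop like the sketch below runs identically on a workstation and on an Arm-based ECU; whether optimised Arm kernels are used is decided inside the framework build, not in the application (the model here is a stand-in, not one of the workloads Day described).

```python
# Framework-level acceleration: identical PyTorch code on every target;
# backend kernel selection happens inside the framework, not here.
import torch
import torch.nn as nn

# Stand-in network; a real diagnostic or perception model would load here.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 8),
).eval()

torch.set_num_threads(4)  # cores budgeted for AI on the target

with torch.inference_mode():           # inference only, no autograd overhead
    sensor_batch = torch.randn(1, 64)  # dummy sensor feature vector
    scores = model(sensor_batch)
print(scores)
```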

Another example came from Sonatus, which used AWS's Arm-based Graviton4 CPUs and KleidiAI in the cloud to run vehicle diagnostics traditionally reliant on GPUs. The result: GPU-class performance on CPUs, a more cost-effective and scalable alternative. As Day highlighted, this architecture also opens the door to future onboard deployment, reducing the need to ping the cloud for every function.

Consumer-Centric AI

Perhaps the most relatable part of Day’s presentation came in the form of a personal story: trying (and failing) to change windscreen wipers. The solution wasn’t mechanical—it was informational. What if a vehicle-integrated chatbot could answer a simple voice query like: “How do I change the wipers?”

In his demonstration, the optimised chatbot searched the digital manual and returned a clear answer—all processed locally on an Arm-based Raspberry Pi. The use case may be simple, but the implication is profound: contextual, vehicle-specific assistance, powered by local AI.
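A plausible shape for such an assistant is simple retrieval-augmented generation: find the relevant manual section, then let the local model phrase the answer. The sketch below uses naive keyword matching over a hypothetical manual excerpt; Day's demo did not reveal its actual retrieval method.

```python
# Manual-lookup sketch: retrieve the best-matching manual section, then
# hand it to the on-device LLM as context. Keyword overlap stands in for
# the embedding search a production system would likely use.
import re

MANUAL = {  # hypothetical excerpt of the digital owner's manual
    "Wiper blades": "To change the wipers, lift the wiper arm, press the "
                    "release tab and slide the old blade off the arm.",
    "Tyre pressure": "Recommended cold tyre pressures are printed on the "
                     "label inside the driver's door jamb.",
}

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def best_section(query: str) -> tuple[str, str]:
    q = tokens(query)
    return max(MANUAL.items(), key=lambda kv: len(q & tokens(kv[0] + " " + kv[1])))

title, passage = best_section("How do I change the wipers?")
# The retrieved passage becomes context for the local model,
# e.g. the llama.cpp chatbot sketched earlier.
prompt = f"Manual section '{title}': {passage}\n\nQuestion: How do I change the wipers?"
print(prompt)
```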

Another application featured AI-based security monitoring, developed by VicOne. The system analysed intrusion data directly in the vehicle, but initially suffered from sluggish performance. With Arm's KleidiAI libraries applied, performance improved by 60%—again, using standard automotive CPUs.

These examples reveal a clear trend: edge AI in vehicles isn’t theoretical—it’s technically feasible, cost-efficient, and developer-friendly when the right tools are available.

Armv9 and the Future of Automotive AI

Looking ahead, Day hinted at future capabilities embedded in Armv9 CPUs, which add new AI-focused instructions and real-time processing enhancements. Combined with KleidiAI-based acceleration, this architecture could support increasingly complex AI models without shifting everything to specialised co-processors.

According to Day, Arm’s approach avoids a hardware arms race. Instead, it focuses on efficiency, developer accessibility, and ecosystem compatibility. The result? OEMs can introduce intelligent features faster, at lower cost, and with less architectural upheaval.