ARM Announces the Cortex-R52 CPU: Deterministic & Safe, For ADAS & More
by Ryan Smith on September 19, 2016 7:30 PM EST
Though it didn’t attract a ton of attention at the time, back in 2013 ARM announced the ARMv8-R architecture. An update for ARM’s architecture for real-time CPUs, ARMv8-R was developed to further the real-time platform by adding support for newer features such as virtualization and memory protection. At the time the company didn’t announce any specific CPU designs for the architecture, but rather just announced the architecture on its own.
Now just under 3 years later, ARM is announcing their first ARMv8-R CPU design this evening with the Cortex-R52. An upgrade of sorts to ARM’s existing Cortex-R5, the R52 is the company’s first implementation of ARMv8-R. R52 makes specific use of many of the new features enabled by the architecture, while improving performance at the same time. ARM is pitching the new CPU core at markets that need a safety-critical CPU – a market that the Cortex-R series has been in for a while – where the deterministic nature of the CPU’s execution model is critical to ensuring quick and accurate execution.
While the focus of today’s CPU design announcement is on functionality and utility over microarchitecture, ARM has revealed a bit about how the Cortex-R52 is organized under the hood. The microarchitecture is a direct evolution of the previous Cortex-R5. This means we’re looking at a dual-issue in-order execution pipeline, with a pipeline length of 8 stages. Broadly speaking, this description is very similar to that of the better-known Cortex-A7/A53 cores, which implies that this is a real-time optimized version of the basic elements in that design.
As the Cortex-R series is focused on determinism and real-time responsiveness over total performance, ARM doesn’t heavily promote these cores on the basis of performance. But at least within the Cortex-R family, they are talking about a performance increase of upwards of 35% in common CPU benchmarks. More important for this market than throughput however is responsiveness: for the R52, ARM has done some specific work to improve interrupt entry and context switching performance, doubling the former and achieving a staggering 14-fold increase on the latter.
The big deal here of course is the deterministic nature of the CPU. The entire microarchitecture is optimized to avoid variable-time, non-deterministic operations, which is why it’s an in-order processor to begin with. This design extends to how memory is managed as well, with ARM avoiding a virtual memory system and its associated TLB misses in favor of a model they call the Protected Memory System Architecture (PMSA), which is used in conjunction with an MPU to handle memory operations without address translation.
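To make the contrast with a TLB-backed virtual memory system concrete, here is a minimal sketch of an MPU-style access check. It is a toy model, not ARM's implementation: the `Region` and `Mpu` names, the 16-region limit, and the example addresses are all assumptions made for illustration. The point it tries to show is that the address is checked against a small, fixed table and then used as-is, so the worst-case lookup cost is bounded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int        # first physical address covered by the region
    size: int        # region length in bytes
    readable: bool
    writable: bool

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


class Mpu:
    """Toy MPU: a fixed-size table of regions, no address translation."""

    MAX_REGIONS = 16  # illustrative limit, not taken from the article

    def __init__(self, regions):
        if len(regions) > self.MAX_REGIONS:
            raise ValueError("too many regions")
        self.regions = tuple(regions)

    def check(self, addr: int, write: bool) -> bool:
        # Every lookup scans the same bounded table, so the worst-case cost
        # is fixed; there is no TLB to miss and no page table to walk.
        for region in self.regions:
            if region.contains(addr):
                return region.writable if write else region.readable
        return False  # addresses outside every region are rejected


# Hypothetical memory map: read-only code and read-write data.
mpu = Mpu([
    Region(base=0x0000_0000, size=0x1000, readable=True, writable=False),
    Region(base=0x2000_0000, size=0x0800, readable=True, writable=True),
])
```

In this sketch a write to the code region or any access past the end of the data region is simply refused, and no access ever pays a variable translation penalty.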
On the safety side of matters, the R52 has a few different error-resiliency features to ensure accuracy. Multi-core lock step returns for this design, allowing two R52 cores to execute the same task in parallel for redundancy. And on the memory side of matters, ECC is offered across both the memory buses and the memory itself, in order to detect and correct random bit flips.
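The idea behind lock step can be sketched in a few lines. This is only an illustration of the general technique, assuming the usual arrangement where the two redundant results are compared and any disagreement is flagged as a fault; ARM has not described the comparison logic here, and `run_in_lockstep`, `healthy_core`, and `faulty_core` are hypothetical names.

```python
class LockstepFault(Exception):
    """Raised when the two redundant results disagree."""


def run_in_lockstep(core_a, core_b, value):
    # Both "cores" (plain callables in this toy model) run the same task
    # on the same input; the results are compared before being accepted.
    result_a = core_a(value)
    result_b = core_b(value)
    if result_a != result_b:
        raise LockstepFault(f"mismatch: {result_a!r} != {result_b!r}")
    return result_a


def healthy_core(x):
    return x * 2 + 1


def faulty_core(x):
    # Same computation, with a single flipped bit in the result to
    # stand in for a transient hardware error.
    return (x * 2 + 1) ^ 0x4
```

Calling `run_in_lockstep(healthy_core, healthy_core, 20)` returns the agreed result, while pairing `healthy_core` with `faulty_core` raises `LockstepFault` rather than letting a corrupted value through.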
Meanwhile in terms of new functionality for hardware developers, as part of ARMv8-R, Cortex-R52 implements support for hardware virtualization. Like virtually everything else in R52, this is deterministic as well, with the hypervisor working with the MPU to offer each guest OS its own section of the physical memory space. According to ARM this is a particularly important advancement, as previous means of separating tasks on real-time CPUs were non-deterministic, which is an obvious problem for the target market.
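As a rough illustration of what giving each guest OS its own section of physical memory means, the sketch below models a hypervisor that hands out non-overlapping partitions and rejects any access outside the active guest's partition. Everything in it is an assumption for the sake of the example: the `Hypervisor` class, the guest names, and the address ranges are hypothetical, and the real hardware mechanism is not described in this level of detail by ARM's announcement.

```python
class PartitionFault(Exception):
    """Raised when a guest touches memory outside its own partition."""


class Hypervisor:
    def __init__(self, memory_size):
        self.memory_size = memory_size
        self.partitions = {}   # guest name -> (base, size)
        self.active = None     # name of the guest currently running

    def add_guest(self, name, base, size):
        if base < 0 or size <= 0 or base + size > self.memory_size:
            raise ValueError("partition outside physical memory")
        for other, (obase, osize) in self.partitions.items():
            # Two half-open ranges overlap if each starts before the other ends.
            if base < obase + osize and obase < base + size:
                raise ValueError(f"partition overlaps guest {other!r}")
        self.partitions[name] = (base, size)

    def switch_to(self, name):
        if name not in self.partitions:
            raise KeyError(name)
        self.active = name

    def access(self, addr):
        if self.active is None:
            raise PartitionFault("no guest is running")
        base, size = self.partitions[self.active]
        if not (base <= addr < base + size):
            raise PartitionFault(f"{self.active} may not access {addr:#x}")
        return addr  # the physical address is used unchanged, no translation


# Hypothetical setup: a safety-critical guest and a non-critical one.
hv = Hypervisor(memory_size=0x10000)
hv.add_guest("brake_control", base=0x0000, size=0x4000)
hv.add_guest("diagnostics", base=0x4000, size=0x4000)
hv.switch_to("diagnostics")
```

With `diagnostics` active, an access inside its own range succeeds, while any attempt to reach into the `brake_control` range raises `PartitionFault`.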
The significance of virtualization in a real-time processor is that it allows for multiple tasks to be executed on the R52 without interfering with each other. In large, complex devices (e.g. cars), this allows for fewer processors within the device, as these tasks can be consolidated onto a smaller number of processors. At the same time, the rigid separation between the tasks means that it’s possible to run both safety-critical and non-critical (but still real-time) tasks on an R52 together, knowing that the latter will not interrupt the safety-critical tasks. For cars and other devices where there is stringent safety certification, this is especially useful as it means that other tasks can be added (via their own guest OS) without invalidating the certifications of the safety-critical tasks.
This is also why ARM’s earlier context switching and interrupt entry improvements are so important. With a hypervisor now in play and multiple tasks executing on a single processor, the vastly improved ability to switch between tasks is critical for allowing multi-tasking without a major performance hit from context switching overhead.
Finally, for the potential market for the Cortex-R52, ARM is pushing the big three traditional markets for real-time and safety-critical processors: automotive, industrial, and medical. All three of these make significant use of real-time functionality, and there’s also a great deal of overlap on safety as well.
ARM is particularly interested in the Advanced Driver Assistance Systems (ADAS) market, where the Cortex-R is part of a full system of ARM IP. A full ADAS setup from start to end would utilize all three processor types – M, R, and A – with the Cortex-R handling the real-time decision making and executing on those decisions, while Cortex-A would be used to handle sensor perception/interpretation, and Cortex-M would be in many of the individual sensors.
Wrapping things up, as with most other ARM IP announcements, the announcement of the Cortex-R52 is setting the stage for future products. ARM isn’t talking about specific customers at this time, but they already have a number of companies who have licensed ARMv8-R and will be in need of a CPU design to go with it. To that end, we should be seeing Cortex-R52 start appearing under the hood of various devices in the coming years.
Raqia - Monday, September 19, 2016
Does this CPU even have caches? What's its closest A-X equivalent in terms of architecture and performance?
nightbringer57 - Tuesday, September 20, 2016
Cache typically is a PITA to handle in real time applications.
This cannot really be compared to application processors, real-time processors typically sacrifice a lot of actual processing power in order to ensure a deterministic behaviour.
Tom Womack - Tuesday, September 20, 2016
Yes, it has caches (see slide 17).
extide - Wednesday, September 21, 2016
It's right in the article A7/A53
lilmoe - Monday, September 19, 2016
A53 successor is way overdue, ARM...
extide - Tuesday, September 20, 2016
A35 ..?
Nenad - Tuesday, September 20, 2016
How much slower is R compared to A?
I understand that determinism is an important benefit of R, but if A is orders of magnitude faster, is it possible that the *worst* case for A would still be guaranteed to be faster than R?
And if that is the case, would A be as good as or better than R even for RT systems (provided you run an RT OS on top of A)?
allajunaki - Tuesday, September 20, 2016
Realtime, for example is how your Car ABS / Traction Control operates. If they miss the processing window, they will skip the event and move on to the next one. Time critical operations follow this method. What R series would allow you would be processing of this sort. Realtime operations require RTOS and a processor that supports Realtime operations. I would imagine these processors will also prioritise low latency over raw throughput. (So no Out Of Order Execution, cache, or anything that can potentially introduce latency).
MrCommunistGen - Tuesday, September 20, 2016
Samsung has previously used ARM R4 cores in their SATA SSD controllers (MDX thru MHX). Wonder if R52 is slated to go into some future NVMe SSD controller!
Anato - Wednesday, September 21, 2016
I doubt safety-critical certifications will allow non-certified code to be run in same processor at will. And I hope we don't need new pile of corpses to prove this. The savings aren't that magnificent that we should mix “random code” with safety-critical task and pray silicon is implemented correctly and doesn't have bugs.