
Integrated DSP and RISC Cores Set New Performance Ceiling

Microcontrollers (MCUs) have tackled low-end, digital signal processing (DSP)-centric applications for years, and hardware multiply-accumulate (MAC) units in digital signal controllers (DSCs) have greatly expanded their DSP capabilities. Still, MCUs that integrate peripherals and memory have fallen well below the performance levels afforded by dedicated DSP-centric processors and by microprocessors that integrate a secondary DSP core. The performance gap is shrinking, however, with multi-core MCU offerings from suppliers such as Texas Instruments (TI) and NXP Semiconductors. Let's explore the latest options for embedded designers who face DSP challenges but whose system footprint and power constraints would generally lead to an MCU-based approach.

Robust DSP-centric capabilities in MCUs go back a decade at this point, to the introduction of the first DSCs. Microchip coined the DSC term with the introduction of the dsPIC30 family, which combined a 16-bit PIC24 MCU with hardware MAC capability and other DSP-centric functions such as a barrel shifter. TI approached the segment differently, leveraging processor technology from its DSP core heritage and integrating it into an MCU architecture with memory and peripherals in the C2000 family.
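To make the MAC operation concrete, here is a minimal host-side model in Python of a fixed-point dot product, the inner loop of a FIR filter. It is a sketch under stated assumptions, not code for any particular DSC: the Q15 number format, the function names, and the saturation policy are all illustrative choices that the article does not specify.

```python
Q15_ONE = 1 << 15  # assumed Q15 format: 1.0 is represented as 32768


def saturate_q15(value: int) -> int:
    """Clamp an integer to the signed 16-bit Q15 range."""
    return max(-Q15_ONE, min(Q15_ONE - 1, value))


def to_q15(x: float) -> int:
    """Convert a float in [-1, 1) to a saturated Q15 integer."""
    return saturate_q15(int(round(x * Q15_ONE)))


def mac(acc: int, a: int, b: int) -> int:
    """One multiply-accumulate step: acc + a * b in a wide accumulator."""
    return acc + a * b


def fir_q15(samples, coeffs) -> int:
    """Dot product of samples and coefficients, returned as saturated Q15."""
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = mac(acc, x, h)  # hardware MAC units do this in one step
    return saturate_q15(acc >> 15)  # Q30 accumulator back to Q15


# Four-tap moving average applied to a constant input of 0.5
taps = [to_q15(0.25)] * 4
window = [to_q15(0.5)] * 4
print(fir_q15(window, taps))  # 16384, i.e. 0.5 in Q15
```

Without a MAC unit, each pass through the loop costs a separate multiply, add, and often a shift, which is why the hardware block matters for filter-heavy code.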

Multi-core MCUs

The DSC architectures mentioned above are single-core designs whether based on an MCU or DSP legacy. The latest TI and NXP DSP-centric offerings, however, are true multi-core designs. The intent in each case is to dedicate a core to the tasks for which it is best suited.

TI's new Concerto family, which includes parts such as the XF28M35H52C1RFPT, combines a C28x DSP-centric core with a floating-point unit (FPU), essentially evolved from the TMS320F283x Delfino MCU family, and the ARM Cortex-M3 RISC core that the company has used in the Stellaris MCU family. In effect, designers who use Concerto get two MCUs in one, as depicted in the block diagram (Figure 1). Each core has its own dedicated set of memory and peripherals. There is also a block of shared resources that supports power and clock distribution, handles interprocessor communications, and implements basic analog peripherals.
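The article does not describe how the interprocessor communication block works, so the following is only a generic illustration of one common pattern for passing messages between two cores: a single-producer, single-consumer ring buffer in which each side writes only its own index. The class and method names are hypothetical, and this Python model makes no claim about TI's actual hardware or driver API.

```python
class Mailbox:
    """Single-producer, single-consumer ring buffer (illustrative model).

    The producer core writes only `head`; the consumer core writes only
    `tail`. Because no index has two writers, the pattern needs no lock.
    """

    def __init__(self, capacity: int):
        # One slot is left unused so that full and empty are distinguishable.
        self.slots = [None] * (capacity + 1)
        self.head = 0  # next slot to write (owned by the producer)
        self.tail = 0  # next slot to read (owned by the consumer)

    def put(self, msg) -> bool:
        """Queue a message; return False if the mailbox is full."""
        nxt = (self.head + 1) % len(self.slots)
        if nxt == self.tail:
            return False  # caller decides whether to drop or retry
        self.slots[self.head] = msg
        self.head = nxt  # publish only after the payload is in place
        return True

    def get(self):
        """Return the oldest message, or None if the mailbox is empty."""
        if self.tail == self.head:
            return None
        msg = self.slots[self.tail]
        self.tail = (self.tail + 1) % len(self.slots)
        return msg


# The control core posts a status word; the management core polls for it.
to_manager = Mailbox(capacity=2)
to_manager.put({"speed_rpm": 1200})
print(to_manager.get())  # {'speed_rpm': 1200}
```

On real silicon the buffer would live in shared message RAM and the indices would need appropriate memory ordering, details that a Python model cannot capture.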

Figure 1: TI's Concerto MCU includes separate RISC and DSP-centric cores, each with a dedicated set of peripheral and memory resources.

NXP takes a slightly different approach in the LPC43xx MCU family depicted in Figure 2. For starters, both CPUs are based on ARM cores. The family doesn't use a homogeneous multi-core approach where the cores are identical. Instead, DSP capabilities are centered in the Cortex-M4 core that includes a MAC, SIMD (single instruction multiple data) execution unit, and an FPU. Meanwhile the Cortex-M0 core is ARM's baseline 32-bit RISC offering for MCUs. The NXP architecture provides a single set of memory and peripheral functionality that is shared by the cores, although a design can dedicate specific memory blocks and peripherals to a single core.
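To show what a SIMD execution unit buys in DSP code, the sketch below models a dual 16-bit multiply-accumulate in Python: two signed 16-bit values are packed into each 32-bit word, and one operation multiplies both pairs and adds both products to the accumulator. The behavior is patterned on dual-MAC instructions such as ARM's SMLAD, but the article does not name specific instructions, so treat that mapping as an assumption. The helper names are hypothetical, and the model wraps the accumulator to 32 bits and ignores status flags.

```python
def pack16(lo: int, hi: int) -> int:
    """Pack two signed 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def s16(word: int) -> int:
    """Interpret the low 16 bits of word as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def dual_mac(acc: int, x: int, y: int) -> int:
    """acc + x.lo * y.lo + x.hi * y.hi, wrapped to signed 32 bits."""
    total = acc + s16(x) * s16(y) + s16(x >> 16) * s16(y >> 16)
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total & 0x80000000 else total


def dot_simd(a, b) -> int:
    """Dot product that consumes two samples per step (even length only)."""
    acc = 0
    for i in range(0, len(a), 2):
        acc = dual_mac(acc, pack16(a[i], a[i + 1]), pack16(b[i], b[i + 1]))
    return acc


print(dot_simd([1, -2, 3, 4], [5, 6, -7, 8]))  # 4
```

The loop runs half as many iterations as a scalar dot product, which is the kind of gain that makes the Cortex-M4 the natural home for signal-processing tasks in this architecture.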

From a programming perspective, embedded designers face a different set of challenges with the TI and NXP multi-core MCUs. In the case of NXP, the cores are largely compatible from the instruction-set perspective with the exception that only the Cortex-M4 supports the math instructions. That should simplify application development and the process of parceling out tasks to the two cores.

Figure 2: NXP combines ARM-Cortex-M0 and -M4 cores in the LPC43xx family, and the cores share a single set of memory and peripheral resources.

In the case of Concerto, the cores have completely different instruction sets. But TI says its development tools mitigate any complications that the heterogeneous cores introduce. The company supplies versions of the ControlSUITE integrated development environment (IDE) for both cores, giving the development team a unified window into development. Programming is typically done in a high-level language, and ControlSUITE supports dual-core debug features. View the ControlSUITE Product Training Module on Hotenda’s website for more information.

Control tasks consume DSCs

Of course, it's fair to ask why we need MCUs with two cores. The MCU segment is very different from the general-purpose microprocessor segment. In the latter, multiple homogeneous cores both accelerate multi-threaded applications and increase the aggregate processing capability of a single microprocessor.

In the MCU case, the real-time control-loop processing requirements of an application typically dictate the processor choice, and such control loops generally can't be spread across multiple cores. The multi-core MCUs will most often dedicate the DSP-centric core to the control loop and the general-purpose core to system-management tasks.

There are certainly many deployed examples in which a legacy DSC handles control-loop processing along with system-management functions and communication interfaces. TI says, however, that a substantial base of C2000 DSC users combines the ICs with a general-purpose MCU. That decision is made because system-management tasks would otherwise limit the fidelity with which the DSC can handle real-time control.
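A quick cycle-budget calculation shows why sharing one core can hurt control fidelity. Every number in this Python sketch is hypothetical, chosen only to illustrate the arithmetic; none comes from the article or from a datasheet, and the function name is an invention for this example.

```python
def loop_budget(cpu_hz: int, loop_hz: int, control_cycles: int):
    """Return (cycles per control period, spare cycles, utilization)."""
    cycles_per_period = cpu_hz // loop_hz
    spare = cycles_per_period - control_cycles
    return cycles_per_period, spare, control_cycles / cycles_per_period


# Hypothetical figures: 150 MHz core, 20 kHz control loop,
# 6,000 cycles spent on the control algorithm each period.
period, spare, utilization = loop_budget(150_000_000, 20_000, 6_000)
print(period, spare)  # 7500 1500
```

Under these assumed figures, only 1,500 cycles per period remain for everything else. Any communication stack or housekeeping task that needs more than that either delays the next control update or forces a slower loop rate, which is the trade-off a second core removes.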

Let's consider an example that illustrates the need for two cores and highlights some other Concerto features. The C28x core in a Concerto MCU excels at tasks such as motor control, and high-resolution PWM peripherals support the application. Meanwhile, some motor-control applications also require specialized communications such as a power-line modem. The Cortex-M3 CPU can handle high-level communication functions but would need the C28x core to handle the modem function. And the combination of motor-control and modem algorithms would prohibit the C28x core from acting as a system manager.

The C28x core used in Concerto includes a hardware block called the VCU (Viterbi, Complex Math, and CRC Unit) that TI has also supplied on some other recently announced C2000 MCUs. Figure 3 depicts a Viterbi decoding chain that might be used in a power-line modem. Implementing the algorithm on the VCU results in 25 times greater performance than is achievable using a software implementation on the C28x. It turns out that dual-core designs don't just enable applications that can't be implemented in a single-core device; they also enable applications that may have previously required a combination of a higher-end microprocessor and a DSP.

Figure 3: For communication-centric applications such as power-line modems, Concerto integrates the VCU (Viterbi, Complex Math, and CRC Unit) to accelerate functions such as Viterbi decoding.
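For readers unfamiliar with the algorithm that the VCU accelerates, the following Python sketch shows a hard-decision Viterbi decoder for a small convolutional code. It is a textbook illustration, not a model of TI's hardware or of any power-line standard: the rate-1/2, constraint-length-3 code with generators 7 and 5 (octal) is an assumption chosen for brevity, and all function names are hypothetical.

```python
GENERATORS = (0b111, 0b101)  # assumed textbook code, not from the article
N_STATES = 4                 # two memory bits in the encoder


def parity(x: int) -> int:
    return bin(x).count("1") & 1


def step(state: int, bit: int):
    """Shift one input bit into the encoder; return (next_state, outputs)."""
    reg = (bit << 2) | state
    return reg >> 1, tuple(parity(reg & g) for g in GENERATORS)


def encode(bits):
    state, out = 0, []
    for b in list(bits) + [0, 0]:  # two tail bits return the encoder to state 0
        state, symbols = step(state, b)
        out.extend(symbols)
    return out


def decode(received):
    """Keep the lowest-Hamming-distance survivor path for each state."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)  # the encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for i in range(0, len(received), 2):
        r0, r1 = received[i], received[i + 1]
        new_metrics = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for s in range(N_STATES):
            if metrics[s] == inf:
                continue  # state not reachable yet
            for b in (0, 1):
                nxt, (o0, o1) = step(s, b)
                m = metrics[s] + (o0 != r0) + (o1 != r1)  # add, then ...
                if m < new_metrics[nxt]:                   # ... compare, select
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metrics, paths = new_metrics, new_paths
    return paths[0][:-2]  # tail bits force the final state to 0; drop them


message = [1, 0, 1, 1, 0, 0, 1]
coded = encode(message)
coded[3] ^= 1  # inject a single channel error
print(decode(coded) == message)  # True
```

The add-compare-select work in the innermost loop runs for every state at every received symbol, which is the repetitive workload that dedicated hardware is well suited to speeding up.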

Indeed, the evolving capabilities of multi-core MCUs will both enable new classes of applications and present new challenges to the design team. Applications such as power-line communications are decidedly complex. TI offers help through both DSP kernel libraries and higher-level application libraries. For example, the company offers application libraries for motor control, digital-power control, power-line communications, and other functions.

TI also has an established methodology of supplying development tools and kits that allow design teams to easily experiment with new C2000-based MCUs. The company supplies what it calls ControlCARDs for each MCU in the C2000 family. The ControlCARD hosts the processor and provides access to all of the MCU signals via a standardized connector. Design teams can develop a single application board with a ControlCARD connector and evaluate a number of C2000 MCUs in the target application. TI also offers Experimenter Kits that combine a ControlCARD and a general-purpose base board. For Concerto, TI offers the TMDXCNCDH52C1 ControlCARD and the TMDXDOCKH52C1 Experimenter Kit.

Hotenda also hosts some Product Training Modules (PTMs) that designers may want to peruse if they are new to Concerto or to the C2000 family in general. The Concerto PTM is focused on the new multi-core MCUs and the previously mentioned ControlSUITE PTM covers the development tools used across the C2000 MCU family.


The next time you face a performance-intensive design challenge, make sure you consider the emerging trend of MCUs that integrate multiple cores. The trend is sure to accelerate, just as Moore's Law has driven the microprocessor segment toward products that mix homogeneous and heterogeneous cores. In the MCU space, you can expect designs built around cores that target specific elements of an application. In the case of TI and NXP, you get a RISC core that excels as a system manager and a DSP-centric core to handle granular control loops and real-time response. And programming such devices may be simpler than you first expect.