In the beginning there were cell phones, and they were good – well, allowing for the fact that they were the size and weight of bricks and could only do one thing: make calls. Today’s cellular handsets are the microcomputers of the 21st century, able to run an endless number of “apps”, stream high-definition video and high-quality audio, snap and process 12 megapixel pictures, and still make calls.
Moore’s Law notwithstanding, that is a lot to ask of a single processor, especially one that needs to operate from a small battery for an extended period of time. Handsets have long used separate applications processors to offload work from the main processor. However, with ARM’s recent introduction of its big.LITTLE approach – and NXP’s implementation of it in a low-power dual-core (M4/M0) embedded MCU – the use of asymmetrical multicore processors (AMPs) in other portable devices looks set to move quickly from niche to mainstream.
Texas Instruments’ OMAP and DaVinci
Texas Instruments’ OMAP™ SoCs have long been the dominant applications processors in cellular handsets. Since streaming video and audio are best handled by a DSP in the data path, OMAP SoCs all combine a general-purpose ARM® processor with a TI DSP.
The original 130 nm OMAP 1 family – such as the OMAP5912ZZG – paired a 192 MHz ARM926EJ-S™ with a TMS320C55x™ DSP core. The internal bus structure – one program bus, three data read buses, two data write buses, and additional buses for peripheral and DMA activity – efficiently enabled the DSP to perform up to three data reads and two data writes in a single cycle for relatively high-speed video and image processing.
Moving down to 65 nm, the OMAP 3 family increased speed significantly. The 600 MHz OMAP3530DZCBB, for example, upgraded the ARM926EJ-S to a Cortex™-A8; the C55x™ DSP to a TMS320C64x™; and added a POWERVR™ SGX graphics accelerator and a NEON™ SIMD coprocessor. While most of the OMAP 3 series of SoCs are sold directly to handset OEMs, the OMAP3530 is a catalog item targeting embedded developers. TI provides a series of OMAP3530 training videos on Hotenda’s web site.
Continuing to up the ante, TI’s 45 nm OMAP 4 platform moved to a dual-core ARM Cortex-A9 MPCore™ processor supporting symmetrical multiprocessing (SMP); switched to a programmable multimedia engine based on the C64x™ DSP; added an IVA 3 hardware accelerator; and upgraded to a POWERVR SGX540 3D graphics accelerator (see Figure 1). With high-end applications in mind, the OMAP4460 can deliver 1080p multi-standard video record and playback as well as stereoscopic 3D encode/decode. While TI utilizes just about every power management trick in the book with these chips, you should not expect to be able to play high-speed online 3D games all day long without recharging your cell phone. Developers wishing to evaluate the OMAP4460 should check out SVTronics’ popular PandaBoard ES.
While TI’s OMAP 5 family – built around ARM’s dual-core Cortex-A15 and two Cortex-M4s – is clearly aimed more at servers than at cell phones, the OMAP-L138 – derived from the DaVinci™ family of video processors – moves back down the power curve with portable devices in mind. The OMAP-L138 combines an ARM926EJ-S RISC MPU with a TMS320C674x fixed/floating-point VLIW DSP running at a modest 375 or 456 MHz. In contrast to the DaVinci chips, the OMAP-L138 supports a wider, less video-specific set of peripherals, and its DSP handles floating-point as well as fixed-point math. TI markets an OMAP-L138 experimenter kit that lets you check this out.
Figure 1: Texas Instruments’ OMAP44x block diagram (Courtesy of Texas Instruments).
If your project is more video oriented, the DM644x series of dual-core DaVinci DSPs may just fill the bill. The TMS320DM6446 includes an ARM926EJ-S core running at up to 405 MHz and a VLIW TMS320C64x+ DSP core running at up to 810 MHz. The chip is available for in-circuit testing on the DM6446 evaluation module.
Analog Devices’ Blackfin
Designed for low-power portable applications, Analog Devices’ Blackfin® is a venerable processor that acts as if it were a dual core – and some of them actually are. Co-developed with Intel, the Blackfin family consists of a wide range of small 16/32-bit RISC processors running anywhere from 300 to 600 MHz. The processor is based on a SIMD architecture and features two 16-bit MACs, two 40-bit ALUs, and a flat address space. Each MAC can perform a 16-bit by 16-bit multiply in each cycle, and special instructions are included to accelerate various signal processing tasks, so a Blackfin can handle control functions while also acting as a DSP. ADI claims that Blackfin displays “best-in-class MHz/mW performance,” though that has become a hotly contested metric that everyone is chasing.
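To make the dual-MAC arrangement a little more concrete, here is a minimal sketch in portable C of a dot product that keeps two running sums, one per MAC, each held to a signed 40-bit range. This is not Blackfin code or an ADI API: the names `dot16_dual_mac` and `sat40` are hypothetical, and saturating the accumulators at 40 bits is an assumption made for illustration. On the real hardware the two multiply-accumulates in the loop body could issue in the same cycle; here they simply run one after the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limits of a signed 40-bit accumulator (assumed saturating in this sketch). */
#define ACC40_MAX ((int64_t)0x7FFFFFFFFF)  /*  2^39 - 1 */
#define ACC40_MIN (-ACC40_MAX - 1)         /* -2^39     */

/* Clamp a 64-bit intermediate value to the signed 40-bit range. */
static int64_t sat40(int64_t v)
{
    if (v > ACC40_MAX) return ACC40_MAX;
    if (v < ACC40_MIN) return ACC40_MIN;
    return v;
}

/*
 * Dot product of two 16-bit vectors using two independent accumulators,
 * standing in for the two MAC units. n must be even so that each pass
 * of the loop feeds one sample pair to each accumulator.
 */
int64_t dot16_dual_mac(const int16_t *x, const int16_t *y, size_t n)
{
    assert(x != NULL && y != NULL);
    assert(n % 2 == 0);

    int64_t acc0 = 0;  /* "MAC 0": even-indexed samples */
    int64_t acc1 = 0;  /* "MAC 1": odd-indexed samples  */

    for (size_t i = 0; i < n; i += 2) {
        /* A 16x16 product always fits in 32 bits. */
        acc0 = sat40(acc0 + (int32_t)x[i]     * y[i]);
        acc1 = sat40(acc1 + (int32_t)x[i + 1] * y[i + 1]);
    }

    /* Combine the two partial sums at the end. */
    return sat40(acc0 + acc1);
}
```

The extra eight bits above a 32-bit product are headroom: under the assumptions of this sketch, each accumulator can absorb at least 256 worst-case products before it can reach the 40-bit limit.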
The ADSP-BF561SBBZ600 (see Figure 2) is a true dual-core device containing two 600 MHz Blackfin cores, each with two 16-bit MACs, two 40-bit ALUs, four 8-bit video ALUs, and a 40-bit shifter. The two cores share 128 Kbytes of low-latency on-chip L2 SRAM and an external memory controller. This is a symmetric multiprocessor (SMP) device targeting a variety of multimedia, industrial, and telecommunications applications. ADI provides numerous product training modules on the Hotenda site, including a Blackfin processor core architecture overview, Blackfin system services, and Blackfin optimizations for performance and power consumption.
Figure 2: Analog Devices’ ADSP-BF561 functional block diagram (Courtesy of Analog Devices).
Freescale Semiconductor’s QorIQ
The Freescale QorIQ™ P1022 is an SMP processor built around two Power Architecture™ e500v2 cores that share a 256 Kbyte L2 cache (see Figure 3). With a clear emphasis on connectivity, the P1022 includes virtualized enhanced three-speed Ethernet with TCP/UDP/IP offload, direct FIFO mode for ASIC connectivity, SATA for local storage, support for three PCI Express interface options, plus the usual USB, SPI, multiple GPIOs, etc. The QorIQ P1022NSN2LFB runs at 1055 MHz and features a double-precision floating-point unit. The P1 Platform Overview training module provides an introduction to the processor family, and the P1022 Multicore Development System lets you get some hands-on experience with the chip.
Figure 3: Freescale Semiconductor’s QorIQ P1022 block diagram (Courtesy of Freescale).
The latest entry into the low-power multicore market is NXP’s LPC4350, which it bills as “the world’s first dual-core DSC” (digital signal controller). Following ARM’s “big.LITTLE” approach – the same one TI took with its OMAP 5 series using Cortex-M4s and Cortex-A15s – NXP combined Cortex-M4 and Cortex-M0 cores in the much lower power LPC4350.
In order to minimize power consumption, the 204 MHz LPC4350 uses the Cortex-M0 core to offload work from the Cortex-M4 whenever possible and the Cortex-M4 to burst data quickly as needed. Clearly targeting the embedded market, the LPC4350’s connectivity options include CAN, EBI/EMI, Ethernet, I²C, Microwire, SD/MMC, SPI, SSI, SSP, UART/USART, and USB OTG; built-in peripherals include brown-out detect/reset, DMA, I²S, LCD, motor control PWM, POR, PWM, and WDT (see Figure 4).
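One common way to hand work from one core to another is a small queue in memory that both cores can see. The sketch below shows the idea in portable C11 as a single-producer, single-consumer ring buffer. It is an illustration only, not NXP’s inter-core API: the type `offload_queue`, the functions `queue_init`, `queue_push`, and `queue_pop`, and the eight-slot size are all hypothetical, and a real design would also need some way to wake the other core, which is omitted here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SLOTS 8u  /* hypothetical size; must be a power of two */

_Static_assert((QUEUE_SLOTS & (QUEUE_SLOTS - 1u)) == 0,
               "QUEUE_SLOTS must be a power of two");

/* Job queue assumed to live in memory visible to both cores. */
typedef struct {
    uint32_t    job[QUEUE_SLOTS];
    atomic_uint head;  /* advanced only by the producer (e.g. the M4) */
    atomic_uint tail;  /* advanced only by the consumer (e.g. the M0) */
} offload_queue;

void queue_init(offload_queue *q)
{
    assert(q != NULL);
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
}

/* Producer side: returns false if the queue is full. */
bool queue_push(offload_queue *q, uint32_t job)
{
    assert(q != NULL);
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == QUEUE_SLOTS)
        return false;
    q->job[head & (QUEUE_SLOTS - 1u)] = job;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false if there is nothing to do. */
bool queue_pop(offload_queue *q, uint32_t *job)
{
    assert(q != NULL && job != NULL);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (head == tail)
        return false;
    *job = q->job[tail & (QUEUE_SLOTS - 1u)];
    /* Release: the slot is free for reuse only after it has been read. */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}
```

Because each index is written by only one side, the queue needs no locks, only ordered loads and stores. That suits a pairing in which one core queues up housekeeping jobs and the other drains them at its own pace.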
|Cortex-A15 vs Cortex-A7