Understand Power and Performance Trade-offs for Efficient MCU Designs

MCUs are used as the main control element in just about every application imaginable. Their power and flexibility make them the go-to component at the heart of most designs. Creating an efficient MCU-based design often depends on making intelligent trade-offs between power and performance. Many MCUs offer several supply options, and the supply voltage you choose can limit the MCU clock rate and thus its performance. Understanding the common relationships between the operating voltage and the operating clock rate can be critical to getting the most out of your next MCU design.

This article quickly reviews some of the common options for powering MCUs and discusses the performance constraints that may result. Common techniques for modifying the operating voltage at run time to obtain the optimal mix of performance and power efficiency are also explored to assist you in selecting and implementing your next MCU-based design.

Frequency vs. operating voltage – a key performance consideration

One of the most fundamental relationships between performance and power is the voltage at which the MCU operates. Operating power is directly related to operating voltage (power equals voltage times current), so operating voltage needs to be a key consideration when deciding on the MCU you will use for your design. You might think that this means you should always use the lowest-power MCU, but if performance is at all an issue in your design you will need to consider the operating frequency as well, and an MCU’s operating frequency is often limited by its operating voltage.

Many MCU manufacturers understand the importance of the relationship between operating voltage, operating frequency, performance, and operating power, and they offer several combinations of operating voltage and operating frequency to make it easier to fit the MCU to your system requirements. As an example, the Renesas RL78 MCU has four different operating voltage ranges, each supporting a different operating frequency, as illustrated in Figure 1 below. Between 1.6 V and 1.8 V, the RL78 can run anywhere between 1 MHz and 4 MHz. Between 2.7 V and 5.5 V, it can run at a maximum of 20 MHz. Thus, the RL78 can run up to five times faster at 2.7 V than at 1.8 V (20 MHz versus 4 MHz), for only a 50% increase in the operating voltage.

Figure 1: Voltage vs. Frequency Chart for Renesas RL78 MCU. (Courtesy of Renesas)

This relationship, in which a modest increase in operating voltage unlocks a much higher operating frequency, is common in many MCUs and is one of the most important relationships to understand when power efficiency is a key requirement in your design. In many cases, it is much more power efficient to keep an MCU in the lowest possible power state, perhaps a deep-sleep mode, and to wake it up only when it needs to do some processing (perhaps sampling a sensor to see if further action needs to be taken). When processing is required, it is typically more power efficient to run at a faster frequency to minimize the time spent in the higher power state. If the processing can be done five times as fast for only a 50% increase in operating voltage (as in the case of the RL78), the active time shrinks by a factor of five, and the overall energy required for the task will be lower whenever the active power rises by less than that factor of five. The actual power at each operating point depends on the device, so check the datasheet current figures before relying on this rule.
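The energy argument above can be checked with simple arithmetic. The following is a minimal sketch, not a model of any particular device: the function names are made up for illustration, and the two active-power figures are hypothetical placeholders rather than RL78 datasheet values. Only the 4 MHz and 20 MHz clock rates come from the text. Sleep-mode energy is ignored.

```python
def task_energy_joules(cycles, clock_hz, active_power_w):
    """Energy for a fixed workload: E = P * t, where t = cycles / f."""
    active_time_s = cycles / clock_hz
    return active_power_w * active_time_s


def faster_point_saves_energy(speedup, power_ratio):
    """The faster operating point wins only if power grows less than speed."""
    return power_ratio < speedup


CYCLES = 40_000            # hypothetical workload size
SLOW_HZ = 4_000_000        # 4 MHz, low-voltage range (from the text)
FAST_HZ = 20_000_000       # 20 MHz, high-voltage range (from the text)
SLOW_POWER_W = 0.001       # hypothetical active power at 4 MHz
FAST_POWER_W = 0.003       # hypothetical active power at 20 MHz

slow_energy = task_energy_joules(CYCLES, SLOW_HZ, SLOW_POWER_W)
fast_energy = task_energy_joules(CYCLES, FAST_HZ, FAST_POWER_W)

print(f"slow: {slow_energy * 1e6:.2f} uJ, fast: {fast_energy * 1e6:.2f} uJ")
print("faster point saves energy:",
      faster_point_saves_energy(FAST_HZ / SLOW_HZ, FAST_POWER_W / SLOW_POWER_W))
```

With these placeholder numbers the faster operating point draws three times the power but finishes in one fifth of the time, so it uses less energy per task. If the measured power ratio for your device exceeds the speedup, the conclusion reverses.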

Clocking control

The frequency of operation of the MCU is managed by a clock-control block, and many clock-control blocks have features that can be used to select, control, and manage the clock sources for the CPU, memory, peripherals, and analog blocks. By controlling the clock frequency to these blocks, and even turning off the clock to features not being used during certain processing routines, you can modulate the amount of dynamic current (the current needed to change the voltage level of a signal or a storage element) so that current is used in the most efficient manner. (Note that battery-based applications, in particular, are the most current-conscious designs since the overall current supplied from the battery is typically the most critical constraint.)
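To see why lowering or gating a block's clock reduces dynamic current, a common first-order approximation is that each clocked block draws a current proportional to its effective switched capacitance, the supply voltage, and its clock frequency. The sketch below uses that approximation; the block names and capacitance values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def dynamic_current_a(c_eff_farads, vdd_volts, clock_hz, clock_enabled=True):
    """First-order estimate: I_dyn ~ C_eff * V * f. A gated clock draws none."""
    if not clock_enabled:
        return 0.0
    return c_eff_farads * vdd_volts * clock_hz


def total_dynamic_current_a(blocks, vdd_volts):
    """Sum the estimate over (c_eff_farads, clock_hz, clock_enabled) tuples."""
    return sum(
        dynamic_current_a(c_eff, vdd_volts, clock_hz, enabled)
        for c_eff, clock_hz, enabled in blocks.values()
    )


VDD = 3.0  # hypothetical supply voltage in volts

# Hypothetical blocks: (effective capacitance, clock frequency, clock enabled)
all_running = {
    "cpu": (100e-12, 20_000_000, True),
    "uart": (20e-12, 10_000_000, True),
}
uart_gated = {
    "cpu": (100e-12, 20_000_000, True),
    "uart": (20e-12, 10_000_000, False),
}

print(f"all running: {total_dynamic_current_a(all_running, VDD) * 1e3:.2f} mA")
print(f"uart gated:  {total_dynamic_current_a(uart_gated, VDD) * 1e3:.2f} mA")
```

This estimate leaves out static (leakage) current and analog bias currents, which do not scale with clock frequency, so treat it as a way to compare clocking options rather than as a prediction of supply current.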

Many of the most common and useful clock-control functions are contained in the Microchip PIC32MX MCU clock-control block, illustrated in Figure 2. The starting points for most clock-control modules are the clocking sources, and having multiple sources makes it possible to optimize the clocking for multiple modules independently. For example, the PIC32MX has a low-power internal RC oscillator (LPRC, near the bottom of Figure 2) that can be used when very-low-speed operation is acceptable. It sources the watchdog timer (WDT) so that even in very-low-power modes this critical timer can still be used. The primary oscillator (POSC) uses an external crystal to generate a precise high-speed clock source for use by the highest-performance sections of the device and feeds the system and USB PLLs (at the top of the diagram). Note that separate PLLs also mean that USB operation can be independent of the system clock, providing an additional level of clocking optimization and potential power savings. The fast RC oscillator (FRC) provides an 8 MHz clock source if the external oscillator is not required, saving board space and component count, and perhaps saving power when the highest frequency and precision are not needed. Finally, a secondary oscillator is available for low-power operation from an external 32 kHz crystal.

Figure 2: Clock-control block in Microchip PIC32MX1xx Family. (Courtesy of Microchip)

These clock sources can be selected and further divided by post-scalers, pre-scalers, and the two main PLLs to generate the frequencies needed for the various sub-sections of the device. The fixed divide-by-16 block and the selectable FRC post-scale divider (controlled by the FRCDIV inputs) feed the selection of the main clock for the CPU and peripherals (SYSCLK). The peripheral clock can be further divided by an additional post-scaler to optimize peripheral clocking speeds, minimizing the dynamic current drawn by these functions. Many of the clocking options shown in Figure 2 can be controlled via configuration registers or are automatically selected depending on the performance level desired by the programmer. Modern MCUs are making it easier to operate even the most complex clock managers via simple application programming interface (API) calls, which simplify configuration and reduce the conflicts that can arise when configuring these blocks “by hand.” Look for these APIs in MCU manufacturer literature, software tool-based configuration wizards, code examples, and reference designs to simplify your design process.
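The divider chain described above reduces to two divisions: the selected source is divided down to SYSCLK, and SYSCLK is divided again for the peripheral clock. Here is a minimal sketch of that arithmetic, assuming the 8 MHz FRC as the source. The function and parameter names are illustrative, not vendor API calls, and the legal divider settings are device-specific and must be taken from the datasheet.

```python
FRC_HZ = 8_000_000  # fast RC oscillator frequency (from the text)


def derived_clocks(source_hz, sys_div, periph_div):
    """Return (sysclk_hz, periph_clk_hz) for a source and two post-scalers.

    sys_div    -- post-scale divider applied to the selected source
    periph_div -- additional divider applied to SYSCLK for the peripherals
    """
    if sys_div < 1 or periph_div < 1:
        raise ValueError("dividers must be positive integers")
    sysclk_hz = source_hz / sys_div
    periph_clk_hz = sysclk_hz / periph_div
    return sysclk_hz, periph_clk_hz


# Hypothetical settings: divide the FRC by 2 for SYSCLK, then by 4 for peripherals.
sysclk, pbclk = derived_clocks(FRC_HZ, sys_div=2, periph_div=4)
print(f"SYSCLK = {sysclk / 1e6:.1f} MHz, peripheral clock = {pbclk / 1e6:.1f} MHz")
```

Working out the derived frequencies this way before touching configuration registers helps confirm that each peripheral still meets its timing requirement (a baud rate, for example) at the reduced clock.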

Flash memory performance and clock frequency

One area often overlooked when selecting an MCU is the performance of the code-flash memory. Some MCUs have fast CPU cycle times, but these fast operating speeds can be limited by the access time of code or data stored in flash memory. For example, in the Atmel AT32UC MCU the flash-cycle time is related to the operating frequency as shown in Figure 3 below. The number of flash wait states (FWS) is zero when operating at 33 MHz, and thus a read access requires just one cycle. At a 66 MHz operating frequency the flash memory inserts one wait state, so each access requires two cycles. Therefore, you might expect to end up with an effective operating frequency of 33 MHz, even when running with a 66 MHz clock. MCU manufacturers have developed several approaches to mitigate the cost of an inserted wait state, however, so you typically pay much less than the full overhead.

Figure 3: Flash wait states for the Atmel AVR AT32UC MCU. (Courtesy of Atmel)

One approach to mitigating flash wait states is to pipeline the flash interface, and this is the approach taken in the Atmel AT32UC MCU. This pipelined approach allows burst reads from sequential memory locations without a read penalty (the vast majority of code-memory accesses are sequential, since execution typically just continues to the next instruction). This results in an average overhead of only 15% in effective cycle time, not the full 100% you might otherwise expect. Another common approach to mitigating slow flash access is to use local-memory caching so that repetitive accesses can use already fetched data and do not require a full read of the slower flash-memory block. You should always look carefully at the interaction between flash-memory access and overall processing performance in your design to determine the effect your selected clock speed will have on overall processing performance.
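To put the wait-state figures in perspective, the overhead can be converted into an effective clock rate. The sketch below assumes that "overhead" means the average cycle time is stretched by that fraction, so a 100% overhead doubles the cycle time and a 15% overhead multiplies it by 1.15. The function name is illustrative.

```python
def effective_mhz(clock_mhz, overhead_fraction):
    """Effective rate when the average cycle is stretched by overhead_fraction."""
    if overhead_fraction < 0:
        raise ValueError("overhead_fraction must be non-negative")
    return clock_mhz / (1.0 + overhead_fraction)


CLOCK_MHZ = 66.0  # operating frequency with one flash wait state (from the text)

worst_case = effective_mhz(CLOCK_MHZ, 1.00)  # every fetch pays the wait state
pipelined = effective_mhz(CLOCK_MHZ, 0.15)   # average overhead quoted in the text

print(f"no mitigation: {worst_case:.1f} MHz effective")
print(f"pipelined:     {pipelined:.1f} MHz effective")
```

Under this interpretation, the pipelined interface delivers roughly 57 MHz of effective throughput from a 66 MHz clock, compared with 33 MHz if every access paid the full wait state. Real code with frequent branches will land somewhere between the two.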

Just turn it off

One of the most power-efficient modes of operation for an MCU is to just turn the device off completely, resulting in zero power dissipation. The MCU performance is then effectively zero as well, so this might not seem to be a very useful approach if you need to make sure some minimal amount of operation always takes place. For example, you might require a real-time clock (RTC) to keep an accurate time record even while the device is turned off. Luckily, some MCUs can run in a battery-backup mode so that simple operations can continue even if the rest of the device is powered off.

Texas Instruments provides just such a capability on the MSP430x5xx/6xx MCU family. As illustrated in Figure 4 below, the battery-backup block supplies a subsystem from a secondary supply (VBAT) if the primary supply (DVCC) fails. The backup-supplied subsystem usually contains a real-time clock module (together with the required LF-crystal oscillator) and a backup RAM. The various operations of the block are controlled by register bits (signals with the “BAK” prefix) so that charging, selecting, and ADC operations can all be managed by the processor. When an RTC and backup SRAM are required, they can run off the battery voltage, and the RTC can even be used to “turn on” the rest of the MCU for periodic operations. This saves the maximum amount of power and creates a very space-efficient and power-efficient control system: the CPU is completely turned off (running at zero frequency) while the system dissipates very little (virtually zero) power, perhaps the ultimate in power-performance trade-offs.
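The benefit of waking the MCU only for periodic operations can be estimated with a duty-cycle average. The following sketch weights the active and backup currents by the time spent in each state. All of the current and timing values are hypothetical placeholders, not MSP430 datasheet figures, and wake-up transition energy is ignored.

```python
def average_current_a(active_a, backup_a, active_s, period_s):
    """Time-weighted average current over one wake-up period."""
    if not 0 <= active_s <= period_s:
        raise ValueError("active_s must lie between 0 and period_s")
    backup_s = period_s - active_s
    return (active_a * active_s + backup_a * backup_s) / period_s


ACTIVE_A = 5e-3    # hypothetical current while the CPU is running
BACKUP_A = 1e-6    # hypothetical current with only RTC and backup RAM powered
ACTIVE_S = 0.010   # hypothetical processing time per wake-up
PERIOD_S = 1.0     # hypothetical interval between RTC wake-ups

avg_a = average_current_a(ACTIVE_A, BACKUP_A, ACTIVE_S, PERIOD_S)
print(f"average current: {avg_a * 1e6:.1f} uA")
```

With these placeholder values the average is dominated by the short active burst rather than by the backup current, which is why shortening the active time (by running faster while awake) and lengthening the wake-up period are usually the most effective levers.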

Figure 4: Battery backup sub-system on the Texas Instruments MSP430x5xx/6xx MCU Family. (Courtesy of Texas Instruments)


Creating an efficient MCU-based design often depends on making intelligent trade-offs between power and performance. Many MCUs offer several supply options, and the supply voltage you choose can limit the MCU clock rate and thus the performance. Managing clocking, selecting the right operating voltage level, and understanding the relationship between clock frequency and flash performance are all critical to creating the most power-efficient MCU design possible.
