As MCUs proliferate into more and more power-constrained designs, processing efficiency becomes an increasing concern to system designers. The drive to get additional processing capability at less power matters not only in battery-operated products but also, increasingly, in “plug-in” modules whose power budget is limited by the module specification. In these cases, an efficient design can deliver more features and create a competitive advantage over a less efficient implementation.
One overlooked aspect of MCU design is in the clock-control area. Designers can often create more efficient designs by using the advanced features of clock-control modules to adjust the clocking for the CPU, the peripherals, and other power-consuming resources. This article will explore the capabilities of some common features of advanced clock-control modules and show how their use can improve processing efficiency in most MCU-based designs.
The overall operation of the MCU is managed by a clock-control block, and many clock-control blocks have features that can be used to select, control, and manage the clock sources for the CPU, memory, peripherals, and analog blocks. By controlling the clocks to these blocks, and even turning off the clock for features not being used during certain processing routines, power can be applied to just the sections of the MCU that need it, and just when they need it, making for a very efficient implementation. Clock management and control begins with the clock sources available to the various sub-modules, so understanding the advantages and disadvantages of each source is the starting point for an efficient MCU implementation.
Many common clock source and control capabilities are available in the STMicroelectronics STM32F37xx MCU clock-control block, illustrated in Figure 1. The STM32F37xx has multiple clock sources, each optimized for a specific function. For example, the High-Speed External (HSE) clock, seen in the middle left of Figure 1, uses either an external crystal/ceramic resonator or an existing user-supplied clock. The external crystal/ceramic resonator can operate from 4 to 32 MHz and produces a very accurate main clock rate. If an external clock is already available, or if the user wishes to exercise additional control over the clock source (perhaps stopping or slowing the clock for finer low-power operation control), the external clock input can be used. This also frees up a GPIO pin, since the external clock input requires one fewer pin than the resonator implementation.
Figure 1: STM32F37xx clock-control block. (Courtesy of STMicroelectronics)
The High-Speed Internal (HSI) clock signal is generated from an internal 8 MHz RC oscillator and can be used directly as a system clock or divided by two prior to being used by the PLL. The HSI RC oscillator has the advantage of providing a clock source at low cost since it uses no external components. It also has a faster startup time than the HSE crystal oscillator; however, even with calibration its frequency is less accurate than that of an external crystal oscillator or ceramic resonator. Note that the CPU clock can only be driven by one of the high-speed clocks or from the output of the PLL.
The STM32F37xx also has two low-speed clock sources. The Low-Speed External (LSE) crystal/ceramic oscillator, shown at the top of Figure 1, uses a high-accuracy 32.768 kHz resonator to create a precise clock source for the real-time clock (RTC) peripheral. The Low-Speed Internal (LSI) RC oscillator provides a 40 kHz signal which drives the independent watchdog timer and, optionally, the RTC when used for auto-wake-up from the low-power Stop/Sleep mode. The low-speed clocks operate independently of the high-speed clocks, so their peripherals can keep running even when the main system clock is turned off, saving significant power.
These clock sources are typical of those found in most MCUs. The clock sources seen in Figure 1 are just the starting point for the clocking architecture. Usually the clocks can be further selected, processed (usually with one or more phase-locked loop, or PLL, blocks), or enabled, all under MCU control. We will continue our exploration of clock control by looking at the typical capabilities of a PLL block in the next section.
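The source selection described above can be sketched as a small frequency model. This is only an illustration, not vendor code: the type names, the `sysclk_hz` function, and the `pll_mul` field are hypothetical, and the PLL multiplier range is not given in this article, so it is left as a caller-supplied value.

```c
#include <assert.h>
#include <stdint.h>

#define HSI_HZ 8000000u  /* internal RC oscillator frequency given in the text */

/* Hypothetical names for the three possible system-clock sources. */
typedef enum { SYSCLK_HSI, SYSCLK_HSE, SYSCLK_PLL } sysclk_src_t;

/* The PLL is fed either by the internal oscillator divided by two or by HSE. */
typedef enum { PLL_IN_HSI_DIV2, PLL_IN_HSE } pll_src_t;

typedef struct {
    sysclk_src_t sysclk_src;
    pll_src_t    pll_src;
    uint32_t     hse_hz;   /* external crystal or clock, 4 to 32 MHz per the text */
    uint32_t     pll_mul;  /* assumed multiplier; its legal range is not modeled */
} clock_cfg_t;

/* Return the resulting system-clock frequency in Hz for a configuration. */
uint32_t sysclk_hz(const clock_cfg_t *cfg)
{
    assert(cfg != 0);
    switch (cfg->sysclk_src) {
    case SYSCLK_HSI:
        return HSI_HZ;
    case SYSCLK_HSE:
        return cfg->hse_hz;
    case SYSCLK_PLL: {
        uint32_t in = (cfg->pll_src == PLL_IN_HSI_DIV2) ? HSI_HZ / 2u
                                                        : cfg->hse_hz;
        return in * cfg->pll_mul;
    }
    }
    return 0u;  /* unreachable for valid enum values */
}
```

With an assumed multiplier of 9, the internal path yields 4 MHz x 9 = 36 MHz, while an 8 MHz crystal on the external path yields 72 MHz. A real driver would also check the device's maximum frequency and wait for the oscillator to stabilize before switching.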
Phase-Locked Loop operation
The clock-processing modules, such as the Phase-Locked Loop (PLL) and the Frequency-Locked Loop (FLL) blocks, are among the most important but least understood elements in a clock-control module. The Freescale Kinetis K10 sub-family of MCUs has a multipurpose-clock-generator (MCG) module, illustrated in Figure 2, which includes both an FLL and a PLL, so it is a good example to review. The Kinetis K10 MCG FLL is controllable by either an internal- or external-reference clock. The PLL is controllable by the external-reference clock. The module can select either of the FLL- or PLL-output clocks, or either of the internal- or external-reference clocks, as the source for the MCU-system clock. The MCG operates in conjunction with a crystal oscillator, which allows an external crystal, ceramic resonator, or another external-clock source to produce the external-reference clock.
Figure 2: Multipurpose-clock-generator block in the Freescale Kinetis K10 MCU family. (Courtesy of Freescale)
The FLL block, shown in the middle of Figure 2, takes the selected clock source and operates on it to create the desired clock frequency. The input clock can be divided by one of sixteen values, from 1 to 1536, to create a base frequency between 31.25 kHz and 39.0625 kHz. The base frequency is then multiplied by the digitally controlled oscillator (DCO) to create the desired output frequency, between 20 MHz and 96 MHz. The ability to select from a variety of frequencies by using the FLL makes it easy to set the clock to just the frequency required for a particular mode of operation, improving processing efficiency.
The PLL operates in a similar manner, but uses a voltage-controlled oscillator (VCO) to adjust the clock output frequency. The input clock to the PLL can be divided by a pre-scaler by a factor of 2 to 25. A phase detector compares the PLL-input clock with the VCO-output clock so that the loop multiplies the input signal by a factor from 24 to 55. The result of all this dividing and multiplying is a clock signal at the precise frequency needed for the application. Typically the PLL requires more power than the FLL, but it offers finer adjustment and greater precision. Note that either the PLL or the FLL can be disabled to reduce power if it is not required to generate the frequencies needed by the application, for example when an external-clock signal supplies the exact frequency needed.
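The divide-then-multiply arithmetic can be illustrated with a short sketch. It uses only the ranges quoted above (a 31.25 kHz to 39.0625 kHz FLL base window, a PLL pre-scaler of 2 to 25, and a multiplier of 24 to 55). The function names are hypothetical, the list of candidate FLL dividers is supplied by the caller because the individual values are not listed here, and limits on the VCO or PLL reference frequency are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the first candidate divider that puts ref_hz / divider inside the
 * FLL base window of 31.25 kHz to 39.0625 kHz, or 0 if none fits. */
uint32_t fll_pick_divider(uint32_t ref_hz, const uint32_t *dividers, size_t count)
{
    assert(dividers != NULL);
    for (size_t i = 0; i < count; i++) {
        uint64_t d = dividers[i];
        if (d == 0) {
            continue;  /* skip invalid entries */
        }
        /* ref/d >= 31250 and ref/d <= 39062.5, written without fractions */
        if ((uint64_t)ref_hz >= 31250u * d &&
            2u * (uint64_t)ref_hz <= 78125u * d) {
            return dividers[i];
        }
    }
    return 0;
}

typedef struct {
    uint32_t prdiv;   /* pre-scaler, 2 to 25 */
    uint32_t vdiv;    /* multiplier, 24 to 55 */
    uint64_t out_hz;  /* resulting frequency: ref_hz * vdiv / prdiv */
} pll_setting_t;

/* Exhaustively search for the pre-scaler/multiplier pair whose output is
 * closest to target_hz. Ties keep the smallest pre-scaler found first. */
pll_setting_t pll_closest(uint32_t ref_hz, uint32_t target_hz)
{
    assert(ref_hz > 0);
    pll_setting_t best = { 0, 0, 0 };
    uint64_t best_err = UINT64_MAX;
    for (uint32_t prdiv = 2; prdiv <= 25; prdiv++) {
        for (uint32_t vdiv = 24; vdiv <= 55; vdiv++) {
            uint64_t out = (uint64_t)ref_hz * vdiv / prdiv;
            uint64_t err = (out > target_hz) ? out - target_hz
                                             : target_hz - out;
            if (err < best_err) {
                best_err = err;
                best.prdiv = prdiv;
                best.vdiv = vdiv;
                best.out_hz = out;
            }
        }
    }
    return best;
}
```

For example, an 8 MHz reference divided by 256 gives exactly 31.25 kHz for the FLL, and the same reference reaches 96 MHz through the PLL with a pre-scaler of 2 and a multiplier of 24. The search is small enough (24 x 32 combinations) to run at start-up or at build time.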
After the source clocks have been adjusted (if needed) by a PLL or FLL, it is up to the clock-distribution network to provide separate clocks for each of the key MCU blocks. The CPU and its associated memory system will need their own clocks, but typically the various peripherals will have their own clocks as well. The more finely divided the peripheral clock network is, the easier it is to individually select and control the clocks and to adjust the operating frequency to that required by the application. The clock network is usually controlled with clock-divider circuits (since a peripheral commonly runs slower than the CPU) and with clock-disable circuits for peripherals that are not needed at all. The more control available, the more finely power and performance can be adjusted, but at some point the additional die area (and its associated cost) and the additional power required must be accounted for. Manufacturers therefore often stop short of complete flexibility in the clock network, while attempting to provide enough capability for most applications to gain significant power and processing efficiency.
NXP, on its LPC15xx MCU, has segmented the clock dividers for each key peripheral function along functional lines, as illustrated in Figure 3, making it easy to determine what settings to use for each sub-block. For example, the clock signal to the USART blocks, shown in the middle right of Figure 3, is sourced from the main clock, but can be pre-divided (by 1 to 255, or disabled completely) using the USART peripheral-clock divider. This clock signal can then be used by individual fractional-baud-rate generators (with 16 bits of clock divisor value available) within each USART block, making it easy to set the baud rate of each USART individually. Other peripheral-clock dividers are available on a common-function basis to make it easy to configure and control the performance, and thus the power dissipation, of all the key peripheral blocks.
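A minimal software model of such a network, with one divider and one gate per peripheral, might look like the sketch below. The structure, the function names, and the count of eight peripherals are all hypothetical and do not correspond to any real device's registers. The summed frequency is only a rough proxy for dynamic power, which in CMOS logic scales roughly linearly with clock frequency.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PERIPHERALS 8u   /* hypothetical count, for illustration only */

typedef struct {
    uint32_t enable_mask;               /* bit n set: peripheral n is clocked */
    uint8_t  divider[NUM_PERIPHERALS];  /* divide-by value, 1 to 255 */
} clock_tree_t;

/* Turn on the clock to peripheral n at main clock / divider. */
void periph_clock_enable(clock_tree_t *t, unsigned n, uint8_t divider)
{
    assert(t != 0 && n < NUM_PERIPHERALS && divider >= 1);
    t->divider[n] = divider;
    t->enable_mask |= (1u << n);
}

/* Gate off the clock to peripheral n. */
void periph_clock_disable(clock_tree_t *t, unsigned n)
{
    assert(t != 0 && n < NUM_PERIPHERALS);
    t->enable_mask &= ~(1u << n);
}

/* Effective clock seen by peripheral n, or 0 if its clock is gated off. */
uint32_t periph_clock_hz(const clock_tree_t *t, unsigned n, uint32_t main_hz)
{
    assert(t != 0 && n < NUM_PERIPHERALS);
    if ((t->enable_mask & (1u << n)) == 0u) {
        return 0u;
    }
    return main_hz / t->divider[n];
}

/* Sum of all running peripheral clocks: a crude stand-in for dynamic power. */
uint64_t total_clocked_hz(const clock_tree_t *t, uint32_t main_hz)
{
    uint64_t sum = 0;
    for (unsigned n = 0; n < NUM_PERIPHERALS; n++) {
        sum += periph_clock_hz(t, n, main_hz);
    }
    return sum;
}
```

Starting from a zero-initialized `clock_tree_t`, enabling one peripheral at divide-by-1 and another at divide-by-4 from a 48 MHz main clock gives a total of 60 MHz of clocked logic; gating off the first drops that to 12 MHz, which is the kind of saving the divider and disable circuits are meant to provide.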
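The two-stage division can be approximated with a short calculation. The sketch below is a simplified model rather than the LPC15xx register interface: it assumes 16x oversampling, treats the 16-bit divisor as a plain integer from 1 to 65536, and leaves out the fractional part of the baud-rate generator. The names `usart_baud_divisor` and `baud_result_t` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define OVERSAMPLE 16u  /* assumed samples per bit; not stated in the text */

typedef struct {
    uint32_t divisor;      /* integer baud divisor, 1 to 65536 */
    uint32_t actual_baud;  /* baud rate actually produced */
    int32_t  error_ppm;    /* (actual - target) / target, in parts per million */
} baud_result_t;

/* Pick the integer divisor that best approximates target_baud from
 * main_hz / prediv, and report the resulting rate and error. */
baud_result_t usart_baud_divisor(uint32_t main_hz, uint32_t prediv,
                                 uint32_t target_baud)
{
    assert(prediv >= 1u && prediv <= 255u);
    assert(target_baud > 0u && target_baud <= main_hz / OVERSAMPLE);

    uint32_t usart_clk = main_hz / prediv;          /* peripheral-clock divider */
    uint32_t per_unit  = OVERSAMPLE * target_baud;  /* clock needed per divisor step */
    uint32_t divisor   = (usart_clk + per_unit / 2u) / per_unit;  /* round to nearest */

    if (divisor < 1u) {
        divisor = 1u;
    }
    if (divisor > 65536u) {
        divisor = 65536u;
    }

    baud_result_t r;
    r.divisor     = divisor;
    r.actual_baud = usart_clk / (OVERSAMPLE * divisor);
    r.error_ppm   = (int32_t)(((int64_t)r.actual_baud - (int64_t)target_baud)
                              * 1000000 / (int64_t)target_baud);
    return r;
}
```

With a 72 MHz main clock, a pre-divider of 1, and a 115200 baud target, this model selects a divisor of 39 and lands about 0.16% high. The residual error from integer-only division is what a fractional generator is intended to reduce.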
Figure 3: Peripheral-clock dividers on the NXP LPC15xx MCU. (Courtesy of NXP)
Grouping peripheral clocks along functional lines is a simple concept, but one that becomes more complicated when peripherals can operate in conjunction. For example, note that the ADC Clock Divider can be sourced from the SC Timer PLL. This seems like an unnecessary connection until you realize that the ADC can work in conjunction with the SC Timer to make periodic conversions without CPU intervention. The additional level of care in creating the flexibility required for desired operations, without overloading the clock control block with extraneous complexity, is the sign of a well-considered implementation.
Dynamic clock control
Once you have selected the clocks for the various peripherals in the most power- and processing-efficient manner, you may think that you are finished. For even more performance and power control, however, you might also want to dynamically adjust the speed of some processing or peripheral blocks depending on the function or mode the MCU is in. Flexible clock-selection networks are a key feature required to make use of dynamic clocking. A section of the Silicon Labs Gecko MCU clock-control block, shown in Figure 4, illustrates this point. The high-frequency clock switch on the upper left side of the figure can select either of the high-frequency clocks (HFXO or HFRCO) or either of the low-frequency clocks (LFXO or LFRCO). This makes it possible to easily “mix and match” the low-frequency and high-frequency sources depending on the mode the device is operating in.
Figure 4: Section of the clock-control block of the Silicon Labs Gecko MCU. (Courtesy of Silicon Labs)
Dynamic clock selection can be most useful when going into and out of a traditional low-power mode would require too much “start-up” time (even the energy-efficient Gecko MCU family can require 2 μs to wake up from the EM2 low-power state and up to 160 μs from the very-low-power EM4 state). Dynamically selecting a low-frequency clock saves less power than entering a low-power state, but the response time is much shorter (typically on the order of just a few clock cycles). This can make the difference between catching an asynchronous event and missing it. Having this level of clocking control can also augment the use of low-power states when a mid-range level of processing performance is required during the operating state. Look for dynamic clock-control opportunities in your design to achieve the most power- and processing-efficient levels.
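The trade-off can be expressed as a simple decision rule: use the deepest low-power state whose wake-up time still fits the application's response deadline, and fall back to a slowed clock when none does. The sketch below considers latency only. The names are hypothetical, and a real design would also weigh the energy cost of each transition and whether the deeper state retains the context the application needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    IDLE_SLOW_CLOCK,   /* stay awake on a low-frequency clock */
    IDLE_LIGHT_SLEEP,  /* shallower low-power state, short wake-up */
    IDLE_DEEP_SLEEP    /* deepest low-power state, long wake-up */
} idle_strategy_t;

typedef struct {
    uint32_t light_wake_us;  /* wake-up time of the shallower state */
    uint32_t deep_wake_us;   /* wake-up time of the deepest state */
} wake_times_t;

/* Choose how to idle given the longest response latency the application
 * can tolerate, in microseconds. */
idle_strategy_t choose_idle_strategy(const wake_times_t *w,
                                     uint32_t max_latency_us)
{
    assert(w != NULL && w->light_wake_us <= w->deep_wake_us);
    if (max_latency_us >= w->deep_wake_us) {
        return IDLE_DEEP_SLEEP;
    }
    if (max_latency_us >= w->light_wake_us) {
        return IDLE_LIGHT_SLEEP;
    }
    return IDLE_SLOW_CLOCK;
}
```

Using the wake-up figures quoted above (2 μs and 160 μs), a 50 μs deadline selects the shallower sleep state, a 160 μs or longer deadline permits the deepest state, and a 1 μs deadline leaves clock slowing as the only option.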
Advanced clock-control modules offer some surprising capabilities that can help manage power use within an MCU. Controlling clocks to peripherals, managing the clock rate of the CPU, and dynamically changing the clock rate during processing are just some of the techniques that can be used to create more efficient designs.