
Power Debugging ARM Cortex-M3 and Cortex-M4 Applications

Your application is now working properly, but you’re still over your power budget. This article shows you how to debug that problem, too.

Power debugging is a recent innovation that shows software developers how the software implementation in an embedded system affects system-level power consumption. Coupling source code to power consumption makes it possible to test and tune an application for power optimization.

Long battery lifetime is a very important factor for many embedded systems in almost any market segment, including medical, consumer electronics, home automation, and many more. Power consumption has traditionally been a design goal that only the hardware developers have been able to influence. But in an active system, power consumption depends not only on the design of the hardware but also on how it is used, which in turn is controlled by the system software.

The technology for power debugging is based on the ability to sample power consumption and correlate each sample with the program’s instruction sequence and hence with the source code. One difficulty is achieving high precision sampling. It would be ideal to be able to sample power consumption with the same frequency the system clock uses, but power system capacitances reduce the reliability of such measurements. Usually this is not a problem since from the software developer’s perspective, it is more interesting to correlate the power consumption with the source code and various events in the program execution rather than with individual instructions, so the resolution needed is much lower than one sample per instruction.

Typically within the debug probe, a resistor is connected in series with the supply to the development board. The voltage drop across this resistor is measured, fed to a differential amplifier and then sampled by an A/D converter.
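As a rough illustration of that measurement chain, the sketch below converts a raw A/D reading back into a current and a power figure. The structure fields, the function names, and every component value used with them are assumptions made for the example; they are not taken from any particular debug probe.

```c
#include <stdint.h>

/* Hypothetical description of the measurement chain inside the probe. */
typedef struct {
    double   shunt_ohms;    /* series resistor in the supply line */
    double   amp_gain;      /* gain of the differential amplifier */
    double   adc_vref;      /* A/D converter reference voltage, in volts */
    unsigned adc_bits;      /* A/D converter resolution, assumed below 32 */
    double   supply_volts;  /* supply voltage fed to the development board */
} shunt_cfg;

/* Convert a raw A/D code into the current through the resistor, in amperes. */
double shunt_current_amps(const shunt_cfg *cfg, uint32_t adc_code)
{
    double full_scale = (double)(((uint64_t)1 << cfg->adc_bits) - 1);
    double v_adc   = ((double)adc_code / full_scale) * cfg->adc_vref;
    double v_shunt = v_adc / cfg->amp_gain;   /* undo the amplifier gain */
    return v_shunt / cfg->shunt_ohms;         /* Ohm's law */
}

/* Approximate board power in watts; the small voltage drop across the
 * resistor itself is ignored here. */
double target_power_watts(const shunt_cfg *cfg, uint32_t adc_code)
{
    return cfg->supply_volts * shunt_current_amps(cfg, adc_code);
}
```

With an assumed 1 ohm resistor, a gain of 100, and a 12-bit converter referenced to 3.3 V, a full-scale reading corresponds to 33 mA through the resistor.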

The key to accurate power debugging is a good correlation between the instruction trace and the power samples. The best correlation can be achieved if a complete instruction trace is available, as is the case for ARM MCUs with ETM (Embedded Trace Macrocell) support. The drawback with using ETM is that it requires a special debug probe and ETM support in the device itself.

Figure 1: The debug module on Cortex-M3/M4.

A less accurate alternative that still gives good correlation is to use the PC (Program Counter) sampling facility available in the ARM Cortex-M3/M4 cores (Figure 1). The DWT (Data Watchpoint and Trace) module implements the PC sampler; it samples the PC periodically, around 10,000 times per second, and triggers an ITM (Instrumentation Trace Macrocell) packet for each sample taken. The ITM is the formatter for events originating from the DWT. It packetizes the events and timestamps them. The debug probe (J-Link Ultra) samples the power consumption of the device using an A/D converter. By timestamping the sampled power values and the PC samples, the debugger is able to present power data on the same time axis as graphs like the interrupt log and variable plots, as well as to correlate power data with the source code.
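The correlation step can be pictured as a merge of two timestamped streams. The following is a minimal sketch of one way to do it, not a description of how any particular debugger works: each power sample is paired with the most recent PC sample taken at or before it. The type names, the microsecond time base, and the function name are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t t_us; uint32_t pc; } pc_sample;
typedef struct { uint64_t t_us; uint32_t microamps; } power_sample;
typedef struct { uint32_t pc; uint32_t microamps; } correlated_sample;

/* Pair each power sample with the latest PC sample at or before it.
 * Both input arrays must be sorted by ascending timestamp. Power samples
 * that precede the first PC sample are skipped. Returns the number of
 * entries written to out (never more than out_cap). */
size_t correlate_samples(const pc_sample *pcs, size_t n_pc,
                         const power_sample *pw, size_t n_pw,
                         correlated_sample *out, size_t out_cap)
{
    size_t i = 0;       /* index of the current candidate PC sample */
    size_t n_out = 0;

    if (n_pc == 0)
        return 0;

    for (size_t j = 0; j < n_pw && n_out < out_cap; ++j) {
        if (pw[j].t_us < pcs[0].t_us)
            continue;   /* no PC information yet for this sample */

        /* Advance to the last PC sample not later than this power sample. */
        while (i + 1 < n_pc && pcs[i + 1].t_us <= pw[j].t_us)
            ++i;

        out[n_out].pc = pcs[i].pc;
        out[n_out].microamps = pw[j].microamps;
        ++n_out;
    }
    return n_out;
}
```

Because both streams are already ordered in time, a single pass over each is enough; the PC index never moves backwards.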

As stated before, power debugging is based on the ability to sample the power consumption and correlate each sample with the source code. For example, in IAR Embedded Workbench the power samples can be displayed in different formats. The Power Log window (Figure 2) is a log of all collected power samples. This window can be useful to find peaks in the power sampling, and since the power samples are correlated with the executed code it is possible to double-click on a value in the Power Log window and get to the corresponding code.

Figure 2: Power Log window.

Another way of viewing the power samples is via the Timeline window (Figure 3). In the Timeline window, the power samples are displayed on a time scale together with the call stack and up to four application variables that you can select.

Figure 3: Timeline window.

Power profiling

On a Cortex-M3 device the debugger can utilize the PC sampling facility in the DWT module. This allows the debugger to sample the PC and provide statistical profiling. The profiler finds the function that contains each sampled PC value and builds an internal database of how often each function is hit, from which it generates function profiling information. The profiling information for each function in an application is displayed in the debugger while the application is running.
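The idea behind statistical profiling can be sketched in a few lines. The example below assumes a table of function address ranges, sorted by start address and non-overlapping, such as a debugger could build from the linker output. The table layout and all names are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t start;   /* first address belonging to the function */
    uint32_t end;     /* one past the last address */
    uint32_t hits;    /* number of PC samples that landed here */
} func_entry;

/* Binary search for the function containing pc; NULL if none matches. */
func_entry *find_function(func_entry *table, size_t n, uint32_t pc)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pc < table[mid].start)
            hi = mid;
        else if (pc >= table[mid].end)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}

/* Count each sampled PC against its function. Returns the number of
 * samples that could be attributed to a known function. */
size_t profile_samples(func_entry *table, size_t n,
                       const uint32_t *pcs, size_t n_pcs)
{
    size_t matched = 0;
    for (size_t i = 0; i < n_pcs; ++i) {
        func_entry *f = find_function(table, n, pcs[i]);
        if (f != NULL) {
            f->hits++;
            matched++;
        }
    }
    return matched;
}

/* Share of attributed samples spent in one function, in percent. */
double hit_percent(const func_entry *f, size_t total)
{
    return total ? 100.0 * (double)f->hits / (double)total : 0.0;
}
```

Since the PC is only sampled periodically, the percentages are estimates that become more reliable the longer the application runs.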

Low-power mode diagnostics

Many embedded applications spend most of their time waiting for something to happen: receiving data on a serial port, watching an I/O pin change state, or waiting for a time delay to expire. If the processor is still running at full speed when it is idle, battery life is being consumed while very little is being accomplished. So in many applications, the microprocessor is only active during a very small amount of the total time and by placing it in a low-power mode during the idle time the battery life can be extended by orders of magnitude.
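A back-of-the-envelope calculation shows where the orders of magnitude come from. The sketch below uses a simple duty-cycle model; the function names are made up for the example, and the model deliberately ignores battery self-discharge, wake-up overhead, and peripheral current.

```c
/* Average current for a system that is active for active_fraction of the
 * time (0.0 to 1.0) and in a low-power mode for the rest. */
double average_current_ma(double active_ma, double sleep_ma,
                          double active_fraction)
{
    return active_fraction * active_ma
         + (1.0 - active_fraction) * sleep_ma;
}

/* Idealized battery life in hours for a given capacity and average draw. */
double battery_life_hours(double capacity_mah, double avg_ma)
{
    if (avg_ma <= 0.0)
        return 0.0;   /* no meaningful answer for a non-positive draw */
    return capacity_mah / avg_ma;
}
```

Taking purely illustrative figures of 20 mA when active, 2 µA when asleep, and a 200 mAh battery, a system that is active 1 percent of the time averages about 0.2 mA and lasts roughly 990 hours, against 10 hours if it never sleeps.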

One approach is to have a task-oriented design and to use an RTOS; in a task-oriented design a task can be defined with the lowest priority and it will only run when there is no other task that needs to run. This idle task is the perfect place to implement power management. In practice every time the idle task is activated it puts the processor into a low-power mode. Many microprocessors and other silicon devices have a number of different low-power modes in which different parts of the processor can be turned off when they are not needed. The oscillator can, for example, be turned off or switched to a lower frequency; peripheral units and timers can be turned off; and the CPU then stops executing instructions. The different low-power modes have different power consumption rates based on which peripherals are left on.

Interrupt handling

In an event-driven system, power consumption increases when the system is activated, as the MCU enters active mode along with any peripheral devices it uses. When execution is then preempted by an interrupt with a higher priority, peripheral devices that were already active are not turned off, even though the higher-priority thread is not using them. Instead, the new thread may activate more peripheral devices, pushing consumption up further.

Even though system performance might be good, more optimizations can still be made in the power domain. Power debugging makes it easier to discover the extraordinary increase in power consumption that occurs when an interrupt hits and to identify it as abnormal. A closer examination of the Timeline window may then show that unused peripheral devices were activated and consumed power for longer than necessary.

DMA versus polled I/O

DMA has traditionally been used to increase transfer speed. In the MCU world chip vendors have invented a plethora of DMA techniques to increase flexibility and speed and to lower power consumption. In some architectures the CPU can even be put into sleep mode during the DMA transfer. Power debugging allows the developer to experiment and see directly in the debugger what effects these DMA techniques will have compared to a traditional CPU-driven polled approach.

Finding conflicting hardware setups

To avoid floating inputs it is a common design practice to tie unused MCU I/O pins to ground. If the software by mistake configures one of the grounded I/O pins as an output driving a logical ‘1’, a current as high as 25 mA may be drawn through that pin. This unexpectedly high current is easily observed by reading the current value from the power graph; it is also possible to find the corresponding erroneous initialization code by looking at the power graph at application startup. A similar situation will arise if an I/O pin is designed to be an input and is driven by an external circuit, but the software incorrectly configures the input pin as an output.

Waiting for device status

One common mistake that can cause unnecessary power consumption is to use a poll loop to wait for a status change of, for example, a peripheral device. Code constructions such as while loops then execute without interruption until the status value changes into the expected state. Another related code construction is the implementation of a software delay as a for or while loop.

In both of these situations the code could be changed to minimize the power consumption. Time delays are better implemented by using a hardware timer. The timer interrupt is set up and after that the CPU goes into a low-power mode until it is awakened by the interrupt. Likewise, waiting for a device status change should be done with interrupts if possible, or with a timer interrupt so that the CPU can sleep between the polls.
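The timer-based delay can be sketched as follows. To keep the example runnable on a desktop machine, the timer and the sleep instruction are passed in as function pointers, and two small simulation helpers stand in for the hardware. All names are hypothetical. On a Cortex-M target the sleep hook would typically wrap the WFI instruction, and the timer hook would program a real hardware timer whose interrupt handler calls delay_timer_isr().

```c
#include <stdint.h>
#include <stdbool.h>

typedef void (*start_timer_fn)(uint32_t ms);
typedef void (*sleep_fn)(void);

static volatile bool delay_done;

/* To be called from the (hypothetical) timer interrupt handler. */
void delay_timer_isr(void)
{
    delay_done = true;
}

/* Wait for ms milliseconds by sleeping instead of spinning.
 * Returns how many times the CPU was put to sleep. */
unsigned delay_sleeping(start_timer_fn start_timer, sleep_fn cpu_sleep,
                        uint32_t ms)
{
    unsigned sleeps = 0;

    if (ms == 0)
        return 0;

    delay_done = false;
    start_timer(ms);
    while (!delay_done) {   /* other interrupts may wake the CPU early */
        cpu_sleep();
        ++sleeps;
    }
    return sleeps;
}

/* --- Host-side simulation only: one wake-up per simulated millisecond --- */
static uint32_t sim_wakeups_left;

void sim_start_timer(uint32_t ms)
{
    sim_wakeups_left = ms;
}

void sim_sleep(void)
{
    /* Each call models one wake-up; the last one is the timer interrupt. */
    if (sim_wakeups_left > 0 && --sim_wakeups_left == 0)
        delay_timer_isr();
}
```

The loop re-checks the flag after every wake-up because any interrupt, not only the timer, ends the sleep. A production version would also need to close the window between testing the flag and going to sleep, for example by masking interrupts around the check.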

Depending on the characteristics of the embedded system it could be difficult to find these situations using power debugging. One way forward is to use the different power debugging windows to get to know the power profile of the application so that abnormal behavior can more easily be identified.

Finally, power debugging allows the developer to verify the power consumption as a function of the clock frequency. A system that spends very little time in sleep mode at 50 MHz is expected to spend 50 percent of the time in sleep mode when running at 100 MHz. The power data in the debugger will allow the developer to verify the expected behavior and, if a nonlinear dependency on the clock frequency exists, to choose the operating frequency that gives the lowest power consumption. Power consumption in a CMOS MCU is theoretically given by the formula:

$$P = f \cdot U^{2} \cdot k$$

where f is the clock frequency, U is the supply voltage and k is a constant.
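The expected sleep time at a given clock frequency follows from simple arithmetic if the workload scales linearly with the clock. The helper below is a sketch under that assumption; the function name and the notion of a "busy frequency" (the lowest clock at which the workload just fits with no idle time) are introduced here for illustration.

```c
/* Expected fraction of time spent asleep when a workload that keeps the
 * CPU fully busy at busy_mhz is run at clock_mhz instead, assuming the
 * execution time scales linearly with the clock frequency. */
double expected_sleep_fraction(double busy_mhz, double clock_mhz)
{
    if (clock_mhz <= busy_mhz)
        return 0.0;   /* no headroom: the CPU never gets to sleep */
    return 1.0 - busy_mhz / clock_mhz;
}
```

For a workload that just fits at 50 MHz, the function gives 0.5 at 100 MHz and 0.75 at 200 MHz. A measured sleep fraction that deviates noticeably from these values points to a nonlinear dependency, such as flash wait states or peripherals that do not scale with the core clock.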

Analog interference

Mixing analog and digital circuits on the same board has its own challenges. Board layout and routing become important for keeping analog noise low so that low-level analog signals can be sampled accurately. Doing a good mixed-signal design requires careful hardware considerations and skills. Software design can also affect the quality of the analog measurements. Performing a lot of I/O activity at the same time as sampling analog signals will cause many digital lines to toggle state at the same time, which is a likely source of extra noise in the A/D converter.

Figure 4: Power spike due to stepper motor interfering with A/D-sampling.

Power debugging will help to identify interference from digital and power supply lines affecting the analog parts. Interrupt activity can easily be displayed in the Timeline window together with power data. Be sure to study the power graph right before the A/D converter interrupts. Power spikes in the vicinity of A/D conversions could be the source of noise and must be investigated. All data presented in the Timeline window is correlated to the executed code; simply double-clicking on a suspicious power sample will bring up the corresponding C source code.
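If the power samples and the A/D interrupt timestamps can be exported, the same inspection can be automated. The sketch below counts conversions that were preceded by a power spike within a chosen time window. The data layout, the function name, the window length, and the threshold are all assumptions for the example, and no particular export format is implied.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t t_us; uint32_t microamps; } power_sample;

/* Count A/D conversions that have at least one power sample above
 * threshold_ua within the window_us microseconds leading up to them. */
size_t count_suspect_conversions(const power_sample *pw, size_t n_pw,
                                 const uint64_t *adc_t_us, size_t n_adc,
                                 uint64_t window_us, uint32_t threshold_ua)
{
    size_t suspects = 0;

    for (size_t a = 0; a < n_adc; ++a) {
        uint64_t end = adc_t_us[a];
        /* Clamp the window start at zero to avoid unsigned wrap-around. */
        uint64_t start = (end >= window_us) ? end - window_us : 0;

        for (size_t i = 0; i < n_pw; ++i) {
            if (pw[i].t_us >= start && pw[i].t_us <= end &&
                pw[i].microamps > threshold_ua) {
                ++suspects;
                break;   /* one spike is enough to flag this conversion */
            }
        }
    }
    return suspects;
}
```

A non-zero result does not prove that the readings are corrupted; it only marks the conversions whose surrounding code deserves a closer look.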


Power debugging techniques provide embedded developers the ability to understand the effect of source code on their application’s power consumption. By careful analysis of power “hot spots” and the review of programming methods used, engineers can make significant battery lifetime savings even during the early stages of project development.