Audio capture and playback are becoming a requisite in many microcontroller (MCU)-based applications. However, the range of audio support in terms of fidelity and codecs is incredibly broad. You can host audio-enabled applications with platforms based on simple 8-bit MCUs, but quality audio may require a digital signal controller (DSC) or 32-bit MCU. This article will survey the breadth of the audio space, suggesting potential applications that match varying levels of MCU performance, and pointing out the readily available evaluation kits that can help you get started in a project with an audio element.
Let's start by having a look at what you can accomplish with an 8-bit MCU. In the past, adding speech recording and playback capability to a product meant using a digital signal processor or a specialized audio chip. Now, Microchip Technology has published an application note focused on the use of adaptive differential pulse code modulation (ADPCM) to handle simple speech encoding and decoding on an 8-bit PIC18F67J10 MCU. ADPCM encoding is based on the fact that successive speech samples are highly correlated. The algorithm predicts each succeeding sample based on the prior one, and only encodes the difference between the predicted and actual samples. You certainly would not use ADPCM for encoding music, but the algorithm is very effective in speech applications.
You will find ADPCM implementations that are based on floating-point math and precision data converters. Such an implementation is clearly beyond the capabilities of an 8-bit MCU. Microchip developed an implementation based on 4-bit ADPCM data. The 8-bit MCU can support mono (single-channel) audio with an 8 kHz sample rate.
The design of the encoder (Figure 1) accepts a stream of 16-bit data in two's complement format. You can use the on-chip 10-bit A/D converter (ADC) to encode samples from a microphone. The decoder takes the 4-bit ADPCM data and generates a 16-bit two’s complement output. You can use the on-chip Capture/Compare/PWM (CCP) peripheral to drive a PWM signal to an output filter.
Figure 1: ADPCM encoder block diagram, where sp is the predicted sample, si is the linear input sample, d is the difference, and t is the 4-bit ADPCM value.
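To make the predict-and-encode-the-difference idea concrete, here is a minimal C sketch of a 4-bit ADPCM encoder and decoder. It assumes the widely used IMA/DVI variant of the algorithm and its standard step-size tables; the article does not say which variant or table a given vendor uses, and the names adpcm_state_t, adpcm_encode, and adpcm_decode are hypothetical. The comments map the variables to the symbols in Figure 1.

```c
#include <stdint.h>

/* Standard IMA/DVI ADPCM tables (assumed variant, see lead-in). */
static const int8_t kIndexAdjust[8] = { -1, -1, -1, -1, 2, 4, 6, 8 };
static const int16_t kStepTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143,
    157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544,
    598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707,
    1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871,
    5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

typedef struct {
    int32_t predicted;  /* sp: predicted sample */
    int32_t index;      /* current position in kStepTable */
} adpcm_state_t;

/* Apply one 4-bit code to the state. The encoder and decoder both call
 * this, so their predictions stay identical. */
static int16_t adpcm_update(adpcm_state_t *s, uint8_t code)
{
    int32_t step = kStepTable[s->index];
    int32_t delta = step >> 3;

    if (code & 4) delta += step;
    if (code & 2) delta += step >> 1;
    if (code & 1) delta += step >> 2;

    if (code & 8) s->predicted -= delta;
    else          s->predicted += delta;

    if (s->predicted > 32767)  s->predicted = 32767;
    if (s->predicted < -32768) s->predicted = -32768;

    s->index += kIndexAdjust[code & 7];
    if (s->index < 0)  s->index = 0;
    if (s->index > 88) s->index = 88;

    return (int16_t)s->predicted;
}

/* Encode one 16-bit two's complement sample (si) into a 4-bit value (t). */
uint8_t adpcm_encode(adpcm_state_t *s, int16_t sample)
{
    int32_t step = kStepTable[s->index];
    int32_t diff = (int32_t)sample - s->predicted;  /* d = si - sp */
    uint8_t code = 0;

    if (diff < 0) { code = 8; diff = -diff; }       /* sign bit */
    if (diff >= step)        { code |= 4; diff -= step; }
    if (diff >= (step >> 1)) { code |= 2; diff -= step >> 1; }
    if (diff >= (step >> 2)) { code |= 1; }

    (void)adpcm_update(s, code);
    return code;
}

/* Decode one 4-bit value back into a 16-bit two's complement sample. */
int16_t adpcm_decode(adpcm_state_t *s, uint8_t code)
{
    return adpcm_update(s, (uint8_t)(code & 0x0F));
}
```

Both sides start from a zeroed adpcm_state_t. Because the encoder runs the same update step as the decoder, the two predictions cannot drift apart, and the whole codec needs only shifts, adds, compares, and a table lookup, which is why it suits an 8-bit core. Two 4-bit values are typically packed into each stored byte.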
Several factors may limit the performance of such an application on an 8-bit MCU, although CPU performance is generally not a major one. For example, the conversion speed of the integrated ADC and the write speed of the flash memory limit the implementation to an 8 kHz rate. Indeed, Microchip says that the speech capability can be realized on lower-performance 8-bit PIC16 family MCUs. The ADPCM application easily fits within the memory footprint of the PIC18F67J10 MCU. The decompression algorithm, for instance, uses only 484 bytes of the 128 Kbytes available for program storage.
Microchip does not offer an audio-centric development kit for PIC18 family MCUs, but you can easily piece one together. The PICDEM development board includes both a PIC18 MCU and a dsPIC30F DSP-enabled MCU or digital signal controller (DSC). Add a Speech Playback PICtail Plus Daughter Board to the kit and you are all set for audio experimentation.
If you take the encoding portion of the task out of the equation, then 8-bit MCUs are far more capable of audio tasks. For example, you might design a product that plays back prerecorded fragments of speech as voice prompts for the end user. You create the sample separately and just use the MCU to decode the data and output a PWM signal.
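The glue between the codec and the converters is mostly a matter of rescaling. The C sketch below shows one way to turn a 10-bit ADC reading into the 16-bit two's complement format the encoder expects, and to turn a decoded 16-bit sample into a PWM duty value. It assumes an unsigned ADC result centered on mid-scale (512) and a 10-bit PWM duty register; both are assumptions about the hardware setup, and the function names are hypothetical.

```c
#include <stdint.h>

/* 10-bit unsigned ADC reading (0..1023, mid-scale 512, assumed)
 * -> 16-bit two's complement sample. */
int16_t adc10_to_pcm16(uint16_t adc)
{
    return (int16_t)(((int32_t)(adc & 0x3FF) - 512) * 64);
}

/* 16-bit two's complement sample
 * -> 10-bit PWM duty value (0..1023, assumed resolution). */
uint16_t pcm16_to_pwm10(int16_t pcm)
{
    return (uint16_t)(((int32_t)pcm + 32768) >> 6);
}
```

In a playback-only design, a timer interrupt running at the sample rate would decode the next stored value and write the result of pcm16_to_pwm10() to the duty register; the output filter then recovers the analog waveform.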
16-bit audio application
Moving up to 16-bit MCUs, you might expect a significant jump in the type of audio applications that you might target. In reality, however, the difference is not as substantial as you might think. As we just discussed, characteristics such as data-conversion time and memory-access speed gate the audio capability. What you do get is additional free MCU cycles to host other elements of the application at hand.
Let's consider Renesas' newest 16-bit MCU series – the RL78 family. This MCU family is optimized for low-power applications including use in battery-powered devices. The design delivers 41 Dhrystone MIPS (DMIPS) with a maximum clock speed of 32 MHz. That performance level places it on the heels of low-end 32-bit MCUs, and in fact is faster than some.
Renesas has published an application note focused on using the RL78 in ADPCM applications. The encoder uses the integrated 10-bit A/D converter to sample the input. Like the Microchip example we discussed previously, the implementation creates 4-bit ADPCM data at an 8 kHz sample rate. The decoder can operate at a sample rate of 11.025 kHz.
You can experiment with Renesas' ADPCM application using the RSK RL78/G13 Developers Kit (Figure 2). The kit is not audio specific, but it does integrate a number of audio-centric features, including both mono and stereo audio amplifiers. The board also includes a microphone input and preamp along with an interface for a digital microphone.
Figure 2: The Renesas Development Kit for the low-power RL78 MCU includes mono and stereo amplifiers and a microphone input.
The other benefit of moving to a 16-bit MCU is a broader selection of encoding algorithms. These are more CPU intensive, but they can deliver either better audio quality or higher levels of compression, meaning you can store more audio in the available memory.
For example, let's consider the Microchip PIC24 family of MCUs. For encode and decode applications, Microchip offers support for ADPCM, G.711, G.726A, and Speex codecs. There are actually more codec choices, but the ones listed are available on a royalty-free basis.
G.711 is an ITU (International Telecommunication Union) standard that is widely used in telephony applications. The standard specifies 8-bit samples, an 8 kHz sampling rate, and uses a PCM algorithm.
G.726A is also an ITU standard and is based on ADPCM. The standard specifies an 8 kHz sampling rate but provides flexibility in terms of sample size and offers a choice of 16, 24, 32, or 40 kbit/s data rates.
Speex is an open-source codec and was developed for voice-over-IP (VoIP) applications. The codec is based on the code-excited linear prediction (CELP) algorithm. The codec can support 8, 16, and 32 kHz sampling rates.
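To give a feel for the simplest of these codecs, the C sketch below shows the kind of companded PCM that G.711 uses to fit each sample into 8 bits. G.711 defines two companding laws, mu-law and A-law; this example covers only mu-law and follows a commonly circulated 16-bit-input formulation (bias of 132, clip at 32635). It is an illustration rather than vendor library code, and the function names are hypothetical.

```c
#include <stdint.h>

#define ULAW_BIAS 0x84   /* 132, added before finding the segment */
#define ULAW_CLIP 32635  /* largest magnitude that survives the bias */

/* Compress a 16-bit two's complement sample into an 8-bit mu-law byte. */
uint8_t linear_to_ulaw(int16_t pcm)
{
    int sign = (pcm < 0) ? 0x80 : 0x00;
    int mag = (pcm < 0) ? -(int)pcm : (int)pcm;
    int exponent = 7;
    int mask;
    int mantissa;

    if (mag > ULAW_CLIP) mag = ULAW_CLIP;
    mag += ULAW_BIAS;

    /* Find the segment: position of the highest set bit above bit 7. */
    for (mask = 0x4000; (mag & mask) == 0 && exponent > 0; mask >>= 1) {
        exponent--;
    }
    mantissa = (mag >> (exponent + 3)) & 0x0F;

    /* mu-law bytes are stored bit-inverted. */
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}

/* Expand an 8-bit mu-law byte back into a 16-bit sample. */
int16_t ulaw_to_linear(uint8_t ulaw)
{
    int u = (uint8_t)~ulaw;
    int sign = u & 0x80;
    int exponent = (u >> 4) & 0x07;
    int mantissa = u & 0x0F;
    int mag = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS;

    return (int16_t)(sign ? -mag : mag);
}
```

The quantization step grows with signal level, so quiet speech keeps fine resolution while loud peaks are coded coarsely. For example, an input of 1000 encodes to 0xCE and decodes to 988.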
Generally, G.711 offers the best quality of the options we have discussed. Microchip says it takes on the order of 60 MIPS, relative to a PIC24 MCU, to implement. Depending on the encoding options selected, the G.726A codec can require 16 to 40 MIPS. The Speex codec can in some cases match G.726A in terms of quality, and requires less than 16 MIPS.
According to Microchip, the G.711 codec requires 8 Kbytes to store 1 second of speech. The requirements for the G.726A codec range from 2 to 5 Kbytes to store one second. Meanwhile, the Speex codec requires only 1 Kbyte to store one second of speech.
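The storage figures above follow directly from each codec's bit rate: divide by eight to get bytes per second. The short C sketch below reproduces that arithmetic and estimates how much speech fits in a given memory. The 8 kbit/s Speex rate is inferred from the 1 Kbyte-per-second figure, the 512-Kbyte memory size in the usage note is a hypothetical example, and the function names are illustrative.

```c
#include <stdint.h>

/* Bytes needed to store one second of audio at a given bit rate. */
uint32_t bytes_per_second(uint32_t bits_per_second)
{
    return bits_per_second / 8u;
}

/* Whole seconds of audio that fit in a memory of storage_bytes. */
uint32_t seconds_of_audio(uint32_t storage_bytes, uint32_t bits_per_second)
{
    return storage_bytes / bytes_per_second(bits_per_second);
}
```

G.711 at 8 bits x 8 kHz = 64 kbit/s gives 8000 bytes per second, and G.726A at 16 to 40 kbit/s gives 2000 to 5000 bytes per second. A hypothetical 512-Kbyte (524,288-byte) memory would therefore hold about 65 seconds of G.711 speech, or about 524 seconds at 8 kbit/s.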
Adding DSP capability
Microchip actually groups the PIC24 family with the dsPIC33 DSC family (Figure 3) because the ICs share the same CPU architecture, although the latter adds math support for DSP applications. In terms of audio applications, it is interesting to see what you add by moving to the DSC.
The dsPIC33 does not add much in terms of the codecs supported, although, again, you will free up CPU cycles that can be used for other facets of an application. However, the DSC allows you to use Microchip's Automatic Gain Control Library, which automatically adjusts the amplitude of a speech signal prior to the encoding process. The capability is especially helpful in applications where the distance between a speaker and a microphone varies such as in speaker phones.
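The general idea behind automatic gain control can be sketched in a few lines of C. The example below is a generic illustration and not Microchip's library: it tracks the input level with a simple envelope follower and nudges a fixed-point gain toward a target output level. All names, the Q8 gain format, and the gain limits are assumptions chosen for clarity.

```c
#include <stdint.h>

#define AGC_GAIN_UNITY 256   /* Q8 fixed point: 256 means a gain of 1.0 */
#define AGC_GAIN_MIN   64    /* 0.25x (assumed lower limit) */
#define AGC_GAIN_MAX   4096  /* 16x (assumed upper limit) */

typedef struct {
    int32_t envelope;  /* running estimate of the input peak level */
    int32_t gain_q8;   /* current gain in Q8 */
} agc_t;

void agc_init(agc_t *a)
{
    a->envelope = 0;
    a->gain_q8 = AGC_GAIN_UNITY;
}

/* Process one sample; target_level is the desired output peak. */
int16_t agc_process(agc_t *a, int16_t in, int32_t target_level)
{
    int32_t mag = (in < 0) ? -(int32_t)in : (int32_t)in;
    int32_t level;
    int32_t out;

    /* Envelope follower: jump up on peaks, otherwise decay slowly
     * toward the current input magnitude. */
    if (mag > a->envelope) a->envelope = mag;
    else a->envelope -= (a->envelope - mag) / 256;

    /* Move the gain one step toward the target output level. */
    level = (a->envelope * a->gain_q8) / 256;
    if (level < target_level && a->gain_q8 < AGC_GAIN_MAX) a->gain_q8++;
    else if (level > target_level && a->gain_q8 > AGC_GAIN_MIN) a->gain_q8--;

    /* Apply the gain and saturate to the 16-bit range. */
    out = ((int32_t)in * a->gain_q8) / 256;
    if (out > 32767)  out = 32767;
    if (out < -32768) out = -32768;
    return (int16_t)out;
}
```

A quiet talker far from the microphone drives the gain up over time, while a loud talker close to it drives the gain down. A production design would also need something this sketch omits, such as a noise gate so that silence is not amplified to the maximum gain.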
The Microchip dsPIC33 ICs also can use the company's Speech and Audio Fast Forward tool. The design team uses the tool during the development process to have real-time control over audio-centric algorithms such as noise suppression, echo cancellation, and equalization. These features are also implemented in libraries. Additionally, the GUI-based speech tool generates code that can be ported to dsPIC33 DSCs.
Microchip offers a comprehensive audio-centric development toolset (Figure 3) for use in PIC24- and dsPIC33-based designs. The Explorer 16 general-purpose development board supports both of the MCU families. You can add the audio support via the Audio PICtail Plus daughter card and software that ships with that product. The combination supports 16- and 24-bit audio, includes 4 Mbits of serial flash memory to store audio, and includes a low-pass filter to demodulate the PWM output from the MCU.
Figure 3: For the dsPIC MCU families, Microchip offers a robust set of audio-centric libraries that can be managed by the GUI-based Speech and Audio Fast Forward development tool.
32 bits and music
Now let's move on to the 32-bit space. As you may have expected, music enters the picture with 32-bit MCUs. Generally, MCUs are not capable of encoding music into a format such as MP3 or WMA (Windows Media Audio) in real time. But 32-bit MCUs can handle flawless music decode, along with all of the audio applications we discussed previously. If you want to implement encoding, you will need to turn to a dedicated codec IC.
When you move into the music area you generally move beyond the capabilities of on-chip peripherals to generate the required audio quality. The combination of an MCU and DAC can handle 16- to 24-bit audio with 32 to 48 kHz sample rates. You will also see audio-centric MCU offerings once you consider the 32-bit space. For example, Atmel offers the AT32UC3 family of 32-bit MCUs in both general-purpose and audio-specific versions. The products are based on the AVR MCU core.
One example of an audio MCU is the AT32UC3A0512AU MCU that integrates 512 Kbytes of flash memory and 64 Kbytes of RAM. The audio MCUs carry identification numbers that are required for the devices to execute licensed algorithms such as MP3, WMA, and AAC decoders. The MCUs integrate a complete feature set that would be required in a portable music player, such as support for flash storage cards and a robust USB stack.
Microchip also supports music applications on its 32-bit PIC32 MCU family that is based on a MIPS core. The 32-bit MCUs do not support the gain-control library or the Speech and Audio Fast Forward development tools that are available for the dsPIC33. However, the 32-bit offerings support all of the other codecs we have discussed here relative to Microchip MCUs.
As you might expect, Microchip offers a number of development tools for the 32-bit MCUs that will come in handy in audio and music projects. The PIC32 audio development board (Figure 4) integrates a PIC32MX795F512 MCU with 512 Kbytes of flash and 128 Kbytes of SRAM. The board also includes a Wolfson codec that can handle real-time music encoding and decoding. Microchip also supports decoding on the PIC32 via the open-source Helix MP3 Decoder Library.
Figure 4: Microchip's PIC32-based audio development board is paired with the iPod PICtail that includes a docking connection for an Apple iPod.
The development board includes a connector that is compatible with Apple's MFi interface that is used on the iPod. Microchip also offers a companion product called the iPod PICtail Plus that includes a dock for an iPod.
As you can see, adding audio features to an MCU-based system design is relatively straightforward given the breadth of tools and libraries available from the MCU vendors. You must approach such a design with a realistic expectation about the audio quality that a given class of MCU can support. You will find that even very low-end MCUs can handle playback of short audio clips. As you move up in processing power, you can add encoding and, ultimately, support for music.