Freescale (NXP) Kinetis KE04


The UART has a fixed 16x oversampling

The KE04 clocking system is simple but inflexible. Because most peripherals are clocked off the bus/flash clock, which is always at or below the core clock, it’s impossible to slow down the CPU while keeping the peripherals running at speed, which is a critical power-saving technique on comparatively power-hungry ARM cores.


Clocking on this microcontroller is basic: the primary clock source is an internal low-speed RC oscillator driving an FLL with a fixed frequency multiplier of 1280. This RC oscillator can be tuned between 31.25 and 37.5 kHz, resulting in an FLL output of 40-48 MHz. A basic binary divider (BDIV) then reduces this by 1-128x in powers of 2. The result feeds the CLK_GEN module, which has an integer prescaler of 1, 2, 3, or 4.
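To make the arithmetic concrete, here is a minimal sketch of that clock chain in C. The function name and parameters are made up for illustration (they are not taken from any Freescale header), and it models only the three stages described above: the 1280x FLL, the power-of-two divider, and the 1-4 prescaler.

```c
#include <assert.h>
#include <stdint.h>

#define FLL_MULTIPLIER 1280u  /* fixed FLL factor quoted in the text */

/* Hypothetical helper: core clock in Hz for a given trimmed RC frequency,
 * binary divider exponent (0..7 means divide by 1..128) and
 * integer prescaler (1..4). */
uint32_t ke04_core_clock_hz(uint32_t irc_hz, unsigned bdiv_exp,
                            unsigned prescaler)
{
    assert(irc_hz >= 31250u && irc_hz <= 37500u);  /* trim range */
    assert(bdiv_exp <= 7u);
    assert(prescaler >= 1u && prescaler <= 4u);

    uint32_t fll_hz = irc_hz * FLL_MULTIPLIER;  /* 40..48 MHz */
    uint32_t divided_hz = fll_hz >> bdiv_exp;   /* power-of-two divider */
    return divided_hz / prescaler;              /* integer prescaler */
}
```

With the oscillator trimmed to 37.5 kHz, a divide-by-4 in the binary divider gives 12 MHz, while leaving the binary divider at 1 and using a prescaler of 3 gives 16 MHz.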


Processor Expert provides a dizzying number of configuration options, exposing essentially every peripheral function through some sort of configurable property-based panel.

Processor Expert

The biggest annoyance with the Freescale/NXP Kinetis line is that there are essentially three entirely different SDKs floating around for different generations (not families) of Kinetis parts. The oldest-generation parts were supported mainly by Processor Expert, a code generator that was originally part of CodeWarrior (even before CodeWarrior became Eclipse-based!). Slightly newer parts had support from Kinetis SDK, a radical departure from Processor Expert that (for better or worse) brought the Kinetis software ecosystem in line with other manufacturers who provide run-time peripheral libraries. This SDK had optional Processor Expert support for generating the config structures passed to these peripheral libraries, but the newest parts are supported by Kinetis SDK 2.0, which has dropped Processor Expert support altogether. One of the main reasons I’m reviewing two different Kinetis ARM chips is these SDK changes. For more information on the new Kinetis SDK, check out the KL03 review.

With the KE04, we’ll be stuck using the KSDK-free “legacy” Processor Expert system. The big advantage with this system is that as soon as you learn the basics, you can essentially build up any project, ever, using any peripheral, without ever reading an MCU datasheet. I know that sounds, well, insane — but it’s not far from the truth.

Processor Expert uses a component-oriented model with dependency resolution; for example, two high-level “PWM” component instances will share a common “Timer Unit” low-level component (as they will end up on different output-compare channels of that timer unit). High-level components implement conceptual functionality, not peripheral functions. For example, if you wanted a function to execute every 25 ms, you would add a “TimerInt” component, and set the interval to 25 ms. Processor Expert will figure out which timer to use (FlexTimer, LPTimer, PIT, etc.), route the clock appropriately, calculate the necessary period register values, enable interrupts, and generate an interrupt handler that takes care of any bits you need to set or clear.
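The “calculate the necessary period register values” step is the kind of thing you would otherwise do by hand. As a rough illustration (not Processor Expert’s actual solver), the sketch below assumes a hypothetical 16-bit timer with a power-of-two prescaler of 1 to 128 and picks the smallest prescaler whose tick count fits in the counter. All type and function names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Settings for a periodic interrupt on a hypothetical 16-bit timer
 * with a power-of-two prescaler (divide by 1, 2, 4, ... 128). */
typedef struct {
    unsigned prescale_exp;  /* prescaler divides by 2^prescale_exp */
    uint16_t modulo;        /* counter reload value: ticks per period - 1 */
} timer_cfg_t;

/* Returns true and fills *out if the period can be reached. */
bool timer_solve_period(uint32_t timer_clk_hz, uint32_t period_us,
                        timer_cfg_t *out)
{
    /* Input clock cycles in one period (64-bit to avoid overflow). */
    uint64_t cycles = (uint64_t)timer_clk_hz * period_us / 1000000u;

    /* Try the smallest prescaler first: it gives the finest resolution. */
    for (unsigned exp = 0; exp <= 7; exp++) {
        uint64_t ticks = cycles >> exp;
        if (ticks >= 1 && ticks <= 65536u) {
            out->prescale_exp = exp;
            out->modulo = (uint16_t)(ticks - 1);
            return true;
        }
    }
    return false;  /* period does not fit this timer */
}
```

For a 25 ms interval on an 8 MHz timer clock, one period is 200,000 input cycles, so this settles on a divide-by-4 prescaler and a reload value of 49,999.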

If you understand the general use of microcontroller peripherals, that’s essentially all you need to know to program a microcontroller if you’re in Processor Expert. The AsyncSerial (UART) component lets you specify an RX and TX buffer size; Processor Expert generates callbacks alerting you to a character being received, as well as when the buffer is full. You can simply copy the data out of the buffer without thinking for a moment about testing and clearing interrupt flags, configuring UART FIFOs, or fussing with DMA. Many of these high-level components also provide functionality that isn’t part of the underlying peripheral at all, or that is only present in some versions of the peripheral: for example, the ADC component can auto-scan through all the channels, even when the hardware ADC doesn’t support this capability.
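The buffered-receive model described above can be sketched in a few lines of portable C. This is only an illustration of the pattern: the names `rx_buffer_t`, `rx_buffer_push`, `rx_buffer_read`, and the two counting handlers are invented here and are not the identifiers Processor Expert generates.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_BUF_SIZE 8u  /* stands in for the size set in the component */

typedef struct {
    uint8_t data[RX_BUF_SIZE];
    size_t  count;
    void  (*on_rx_char)(void);  /* fired for every received character */
    void  (*on_rx_full)(void);  /* fired when the buffer fills up */
} rx_buffer_t;

/* What the generated ISR would do after reading the data register:
 * store the byte and fire the user callbacks. Returns 0 on success,
 * -1 if the byte was dropped because the buffer was already full. */
int rx_buffer_push(rx_buffer_t *rb, uint8_t byte)
{
    if (rb->count >= RX_BUF_SIZE) return -1;
    rb->data[rb->count++] = byte;
    if (rb->on_rx_char) rb->on_rx_char();
    if (rb->count == RX_BUF_SIZE && rb->on_rx_full) rb->on_rx_full();
    return 0;
}

/* Application side: copy out what has arrived and remove it from the buffer. */
size_t rx_buffer_read(rx_buffer_t *rb, uint8_t *dst, size_t max)
{
    size_t n = rb->count < max ? rb->count : max;
    memcpy(dst, rb->data, n);
    memmove(rb->data, rb->data + n, rb->count - n);
    rb->count -= n;
    return n;
}

/* Example user event handlers: they only count how often they fire. */
unsigned chars_seen, full_events;
void count_char(void) { chars_seen++; }
void count_full(void) { full_events++; }
```

The application only ever touches `rx_buffer_read` and the two handlers; flag handling and register access stay inside the generated interrupt code.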

Unlike some code-gen tools, like Silicon Labs’ Simplicity Configurator, Processor Expert generates everything: linker scripts, initialization code, runtime libraries, and callback functions. Unlike code-gen tools like Infineon DAVE CE, which generates code that calls into standard runtime peripheral libraries, Processor Expert generates specific API calls on request, with any initialization values pre-calculated as constants.  As an example, while some code-gen tools will generate a UART module that calls a standard runtime initialization routine with a human-readable baud rate, Processor Expert will generate code to directly write the correct baud rate generator values to the appropriate registers.
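The difference between the two styles is easier to see side by side. The sketch below uses a mock register struct with illustrative field names, and assumes a UART whose baud rate is the bus clock divided by 16 times a 13-bit divisor split across a high and a low register. The “generated” variant hard-codes values worked out for an 8 MHz bus clock and 9600 baud; neither function is actual Processor Expert or vendor library output.

```c
#include <stdint.h>

/* Mock of a UART's baud-rate registers; field names are illustrative. */
typedef struct {
    volatile uint8_t bdh;  /* high bits of the divisor */
    volatile uint8_t bdl;  /* low 8 bits of the divisor */
} uart_regs_t;

/* Runtime-library style: take a human-readable baud rate and work out
 * the divisor on the target every time the function is called. */
void uart_init_runtime(uart_regs_t *uart, uint32_t bus_hz, uint32_t baud)
{
    uint32_t div = (bus_hz + 8u * baud) / (16u * baud);  /* round to nearest */
    uart->bdh = (uint8_t)((div >> 8) & 0x1Fu);
    uart->bdl = (uint8_t)(div & 0xFFu);
}

/* Code-generator style: the arithmetic was done on the host
 * (8 MHz bus clock, 9600 baud), so only constants remain. */
void uart_init_generated(uart_regs_t *uart)
{
    uart->bdh = 0x00u;
    uart->bdl = 0x34u;  /* divisor 52 */
}
```

Both end up with the same register contents, but the second version needs no division at run time and no baud-rate parameter, which is part of why the generated code can be smaller.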

A lot of code-gen tools suffer from a loss of generality that makes them work well in toy examples but become useless in real applications, especially for projects that have low-power requirements or use dynamic pin-muxing. However, Processor Expert supports essentially anything imaginable: multiple system configurations allow you to switch between different clock and run modes, and components can be explicitly configured to share pins with other components. If worse comes to worst, and you’re stuck with a component that needs to behave slightly differently, you can always disable re-generation for that individual component, allowing you to modify it to your liking.

While this all makes Processor Expert sound extremely fast and optimized, performance is actually somewhat mixed. On large processors with DMA, auto-scanning ADCs, FIFOs, and other advanced features, Processor Expert will transparently use these, which can provide a huge performance boost over runtime peripheral libraries that don’t always expose complex functionality like this to the end user. Unfortunately, some of the generated code simply makes you shake your head: it takes 40 cycles to toggle a GPIO pin, since the high-level “Bit” component calls into a low-level GPIO component, passing it a configuration structure and other unused parameters. Nothing is inlined, and even compiling with -O3 won’t eliminate these function calls. Of course, for performance-critical GPIO operations, it’s easy enough to write to the registers directly. It’s tough to compare code size, but Processor Expert does seem to generate more efficient code than using runtime peripheral libraries (even when link-time optimization is enabled for the latter). In my testing, Processor Expert generated peripheral code that was close to half the size of the runtime peripheral library code for other MCUs (including Kinetis SDK on the KL03).
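For comparison, a direct register toggle is a single read-modify-write. The sketch below is deliberately generic: `port_out` is a placeholder for a port’s memory-mapped data output register, and the function name is invented for this example rather than taken from the KE04 headers.

```c
#include <stdint.h>

/* Toggle one pin by XOR-ing its bit in a port output register.
 * `port_out` is a placeholder; on real hardware it would point at
 * the port's memory-mapped data output register. */
void gpio_toggle(volatile uint32_t *port_out, unsigned pin)
{
    *port_out ^= (uint32_t)1u << pin;
}
```

There is no configuration structure and no layering here, so a compiler can reduce a call like this to a handful of instructions.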

The 500-lb gorilla in the room, however, is this: if you’re new to Processor Expert, I think the first thing you’ll notice is how incredibly slow it is to use. I don’t mean “complicated” or “intricate” —  I mean that even on my 4.5 GHz 12-thread desktop, creating instances of components, switching views, changing values, generating code, and building projects takes forever. The entire system is single-threaded, and every time a property is changed, everything has to be re-evaluated. I’m not sure it could be made faster — because of how flexible Processor Expert is, almost everything has a huge dependency graph; and because almost everything is automated, the whole system has to solve for the proper register values from a near-infinite possible selection.

Having said all that, in my testing, it was “fast enough” to not be completely frustrating to use, and it’s one of the only development environments I tested where I was able to literally complete an entire project without even glancing at a datasheet for the microcontroller. That would be impressive enough on an 8-bitter, but on a modern Cortex-M0+ ARM microcontroller, with complex (“flexible”) peripherals, it’s downright incredible. It is, by far, the most complete, flexible code-generator tool I’ve ever used, too.

So who is it for? Honestly, peripheral configuration and bring-up is generally a drop in the bucket when compared to the time required to implement an entire commercial embedded project, but if you’re working on tiny projects (either hobbyist stuff, or simple proof-of-concept engineering demos), having a tool like Processor Expert around can get things working much more quickly than using runtime peripheral libraries, especially when you’re new to the Kinetis ecosystem, ARM microcontrollers, or MCU programming in general.

Kinetis Design Studio offers a nearly-stock Eclipse debugging environment.


Kinetis Design Studio is nearly stock

Even with an 8 MHz clock supplying the core and the bus,

DMX-512 Receiver

It took 218 clock cycles to complete a single byte-save-to-buffer routine in Processor Expert.
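To put that number in context, here is a rough budget calculation. It assumes the 218 figure is in core clock cycles and that DMX-512 bytes arrive back to back at 250 kbaud with 11 bits per slot (1 start, 8 data, 2 stop); the function names are made up for this sketch.

```c
#include <stdint.h>

#define DMX_BAUD          250000u  /* DMX-512 line rate */
#define DMX_BITS_PER_SLOT 11u      /* 1 start + 8 data + 2 stop bits */

/* Core clock cycles available between back-to-back DMX bytes. */
uint32_t dmx_cycles_per_slot(uint32_t core_hz)
{
    return (uint32_t)((uint64_t)core_hz * DMX_BITS_PER_SLOT / DMX_BAUD);
}

/* Percentage of that budget used by a handler taking `isr_cycles` cycles
 * (rounded down). */
uint32_t dmx_isr_load_percent(uint32_t core_hz, uint32_t isr_cycles)
{
    return isr_cycles * 100u / dmx_cycles_per_slot(core_hz);
}
```

Under those assumptions a 12 MHz core has 528 cycles per byte, so a 218-cycle routine uses roughly 41% of the time available; at 8 MHz the budget shrinks to 352 cycles and the load rises to about 61%.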

Using a 16 MHz core clock and an 8 MHz bus clock, the power consumption was 4.76 mA. A 12/12 clocking scheme (12 MHz core, 12 MHz bus) yielded the lowest power at 3.77 mA (a major contributor is that the clock could be divided with BDIV instead of the Clock Gen’s integer divider). Looking at a logic trace, there’s room to further reduce the core/bus clocks. Unfortunately, the UART (which needs a clock of 4 MHz or a multiple of it) is chained to the flash clock, and there’s no way to get an 8 or 4 MHz flash clock with a 12 MHz core clock.
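The 4 MHz requirement falls straight out of the fixed 16x oversampling: 250 kbaud times 16 is 4 MHz, and the UART can only divide its input clock by a whole number. A small check, with a function name invented for this sketch, makes it easy to test candidate clocks:

```c
#include <stdint.h>

/* Returns the integer divisor that yields exactly `baud` with 16x
 * oversampling, or 0 if `uart_clk_hz` cannot produce it exactly. */
uint32_t uart_exact_divisor(uint32_t uart_clk_hz, uint32_t baud)
{
    uint32_t step = 16u * baud;
    return (uart_clk_hz % step == 0u) ? uart_clk_hz / step : 0u;
}
```

At 250 kbaud, 12 MHz, 8 MHz, and 4 MHz give divisors of 3, 2, and 1, while something like 6 MHz has no exact divisor at all.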