SPI code strategy, Cracking Contraptions

Need to do some thinking out loud… This coding stuff is getting a lot more complicated than I thought.

Perhaps my long enforced layoff from coding had a silver lining. Before I left I was churning out code as fast as I could to get the ranging working. I didn’t manage to get it going, but I left a pile of badly engineered and undocumented code in my wake, which I now find makes little sense. I decided to start again by carefully reviewing everything I wrote, and taking the time to make sure I fully understand it all.

With TS06, I’m going to try to be more sophisticated in pursuit of power efficiency. It would be nice not to have to bother about this aspect yet, but I think I must. The problem is that I need to adopt a more reasonable real-time architecture to have any hope of improving battery life over TS05.

Old Scheme

TS05 (Atmel) handled everything in the simplest way possible – a giant polling loop. Two downsides of that were:

It spent a lot of time polling devices that had no data ready. That was wasting processor nap opportunities.
That seemed trivial at the time, especially when I was finding it such a challenge to get any response at all from the wide array of devices. In the light of all that experience, I now know that each of those polling operations involved sending commands between the processor and the device at SPI or I2C speeds. In the case of SPI, all the devices were on a single bus, and the bus settings sometimes had to be changed between devices that used incompatible communication protocols. In itself, that reconfiguration is a pretty insignificant processor overhead.
The bus traffic is another matter. Communication ran at about 400 kbit/s, so a single average transaction took about 0.5 ms to discover that the peripheral had nothing to say! That’s not insignificant.

With about ten devices to poll, a single poll cycle took 5 ms, which limited the maximum polling rate to 200 Hz. This polling rate in itself was not a limitation: TS05 polled at about 10 Hz, too slow for some devices, and way too fast for others. The 10 Hz rate allowed samples to be collected at a regular rate which, to my mind at the time, made the signal processing easier to fathom. Stupidly, a lot of the time the processor was polling the clock to see if it was time to collect another round of samples. How dumb can you get?
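The polling arithmetic above can be written out as a small C sketch. The function names are made up for illustration, and the figures are just the rough ones quoted in the text (ten devices, about 0.5 ms per transaction, a 10 Hz poll rate), not measurements.

```c
#include <assert.h>

/* Time for one full poll cycle, in microseconds: every device is visited
 * once, whether or not it has anything to say. */
unsigned poll_cycle_us(unsigned devices, unsigned transaction_us)
{
    return devices * transaction_us;
}

/* Highest polling rate the bus could sustain, in Hz, if it did nothing
 * but poll back to back. */
unsigned max_poll_rate_hz(unsigned cycle_us)
{
    assert(cycle_us > 0);
    return 1000000u / cycle_us;
}

/* Bus time spent polling in each second at a given poll rate, in
 * microseconds. */
unsigned busy_us_per_second(unsigned cycle_us, unsigned rate_hz)
{
    return cycle_us * rate_hz;
}
```

With those inputs, poll_cycle_us(10, 500) gives 5000 µs, max_poll_rate_hz(5000) gives 200 Hz, and busy_us_per_second(5000, 10) gives 50000 µs, i.e. about 50 ms of bus activity in every second at the 10 Hz rate.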

This scheme meant that the processor and all the devices had to be on and running for 5 ms in every poll cycle, which at 10 Hz comes to about 50 ms every second. As things worked out, the processor was almost never sleeping, so the drain on power was substantial, and the huge battery would last for only 1-2 days in continuous use.

New Scheme

TS06 (Nordic) needs to take a more modern, sophisticated, and one might say, standard approach.

The modern way to do things is to start with an almost trivial main loop:

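for (;;)
{
    sleep_till_interrupt();  /* power down and wait until an interrupt wakes the processor */
    run_scheduler();         /* process every event the interrupt handlers have queued */
}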

In this scheme the processor’s prime strategic objective is to turn the power off to as many components as it can, and then go to sleep until something generates an interrupt.

Behind the scenes, when an interrupt occurs, the processor is woken up, and the Touchstone (TS) interrupt handler (IRQ) is automatically invoked by the hardware.

The interrupt handler does whatever it can to gather up and save the raw data pertaining to the interrupt. For an accelerometer, it might collect the x, y and z acceleration values, along with an accurate time stamp. Then it bundles up the data and posts it on the scheduler event queue: a queue of notifications that something significant has happened. TS maintains this queue behind the scenes. Finally the IRQ returns.

The main loop routine sleep_till_interrupt() learns that an interrupt has occurred, and it returns control to the main loop, which immediately invokes the scheduler.

Now the scheduler takes each item off the head of the queue and calls a previously registered event handler to do the (possibly heavy) processing work required to transform the event data into some Touchstone activity. Normally this would simply involve transforming the data into some calibrated, standard form and storing it, but it might eventually bubble up through the application layers to cause some alert message to be issued to the user, for example. Several interrupts may occur in quick succession, so the scheduler works its way through them all before finally returning to the main loop, where the processor tries to go to sleep again.
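The flow described above (interrupt handler posts an event, scheduler drains the queue and dispatches to registered handlers) can be sketched in portable C as follows. This is only an illustration under simple assumptions: one interrupt source posting at a time, a small fixed-size ring buffer, and made-up names (ts_event_t, ts_post_event, ts_register_handler, on_accel). It is not the Touchstone code, nor Nordic's scheduler API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 8u   /* power of two, so the wrapping indices stay consistent */

enum { EV_ACCEL = 0, EV_BUTTON = 1, EV_TYPE_COUNT = 2 };

typedef struct {
    uint8_t  type;       /* which source raised the event */
    uint32_t timestamp;  /* captured in the interrupt handler */
    int16_t  data[3];    /* raw payload, e.g. x, y, z acceleration */
} ts_event_t;

typedef void (*ts_handler_t)(const ts_event_t *ev);

static ts_event_t        queue[QUEUE_SIZE];
static volatile uint32_t head;  /* next slot to read; only the scheduler advances it */
static volatile uint32_t tail;  /* next slot to write; only the poster advances it */
static ts_handler_t      handlers[EV_TYPE_COUNT];

/* Register the function that does the heavy processing for one event type. */
void ts_register_handler(uint8_t type, ts_handler_t fn)
{
    assert(type < EV_TYPE_COUNT && fn != NULL);
    handlers[type] = fn;
}

/* Called from interrupt context: copy the raw data and get out quickly.
 * Returns 1 on success, 0 if the queue is full and the event was dropped.
 * Assumption: with several interrupt priorities posting at once, a real
 * system would need a critical section around this. */
int ts_post_event(const ts_event_t *ev)
{
    if (tail - head == QUEUE_SIZE) {
        return 0;
    }
    queue[tail % QUEUE_SIZE] = *ev;
    tail++;
    return 1;
}

/* Called from the main loop: dispatch every queued event, oldest first.
 * Returns the number of events taken off the queue. */
unsigned run_scheduler(void)
{
    unsigned dispatched = 0;
    while (head != tail) {
        ts_event_t ev = queue[head % QUEUE_SIZE];
        head++;
        if (ev.type < EV_TYPE_COUNT && handlers[ev.type] != NULL) {
            handlers[ev.type](&ev);
        }
        dispatched++;
    }
    return dispatched;
}

/* Example handler: just counts accelerometer samples. */
static unsigned accel_samples_seen;

void on_accel(const ts_event_t *ev)
{
    (void)ev;
    accel_samples_seen++;
}
```

In use, start-up code would call ts_register_handler(EV_ACCEL, on_accel) once, the accelerometer interrupt handler would fill in a ts_event_t and call ts_post_event(), and the main loop's run_scheduler() would do the rest after the processor wakes.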

Advantages of this approach

This approach has many advantages. Here are a few:

The processor, and perhaps a bunch of subsystems, can be turned off until some interrupt occurs. Depending on what has been left running, the sleep current can be a few µA, or even less.
Events can be processed much sooner after they occur (lower latency).
Some interrupts can be processed without even waking the processor – button de-bouncing for example.
There is no pointless traffic on the SPI and I2C buses, which saves power.
Changes in sensor data can be handled at a higher rate.

Challenges

This approach is rather more difficult to implement, understand, and debug.

One of the challenges is that the whole architecture has to be designed as a state machine. In other words, the current state of TS has to be succinctly described, so that it is possible to figure out what the right thing to do is for each possible interrupt. For example, some events have to be delayed until others have completely finished.
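To make the state-machine idea concrete, here is a minimal sketch of one small piece of it: a shared bus that can only run one transfer at a time, so a request that arrives while it is busy has to be deferred until the current transfer finishes. The states, events and names (bus_fsm_t, bus_fsm_handle and so on) are invented for this example and are not taken from the TS code.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BUS_IDLE, BUS_BUSY } bus_state_t;
typedef enum { BUS_EV_REQUEST, BUS_EV_TRANSFER_DONE } bus_event_t;

typedef struct {
    bus_state_t state;
    unsigned    pending;  /* requests deferred because the bus was busy */
    unsigned    started;  /* transfers actually started so far */
} bus_fsm_t;

void bus_fsm_init(bus_fsm_t *fsm)
{
    assert(fsm != NULL);
    fsm->state   = BUS_IDLE;
    fsm->pending = 0;
    fsm->started = 0;
}

/* Decide what to do with one event, given the current state. */
void bus_fsm_handle(bus_fsm_t *fsm, bus_event_t ev)
{
    assert(fsm != NULL);
    switch (fsm->state) {
    case BUS_IDLE:
        if (ev == BUS_EV_REQUEST) {
            fsm->state = BUS_BUSY;   /* start the transfer right away */
            fsm->started++;
        }
        /* a stray TRANSFER_DONE while idle is ignored */
        break;
    case BUS_BUSY:
        if (ev == BUS_EV_REQUEST) {
            fsm->pending++;          /* defer until the current one finishes */
        } else if (fsm->pending > 0) {
            fsm->pending--;          /* done: start the next deferred request */
            fsm->started++;
        } else {
            fsm->state = BUS_IDLE;   /* done and nothing waiting */
        }
        break;
    }
}
```

The point of writing it this way is that every (state, event) pair has an explicit answer, which is exactly the part that is hard to get right when the logic is scattered across interrupt handlers.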

The other is that the standard routines that Nordic issues don’t consistently implement this model either!

And in particular

The generic SPI driver issued by Nordic is necessarily pretty complicated. Obviously I want to try and use their code as much as possible because it has been well tested. Unfortunately it is rather limited. In particular it is not written to send events to the scheduler.

I managed to implement the new architecture for the timer, RTC and UART, but their generic SPI driver is much more of a challenge. So the question is, should I rewrite it entirely, or should I try to kludge a layer on top of it to handle its shortcomings?

I took the second approach to save time. I have managed to get an event stream working for the DW-SPI device, and it seems to be doing the right thing. Obviously the DW is not the only device on the SPI bus, but their driver pretty much assumes that it is. If the devices on the bus all use exactly the same protocol (clock polarity, clock phase, speed, and bit order) then it might be possible to kludge it, but that’s probably wishful thinking in the long run.
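One possible shape for the multi-device version is a thin layer that keeps a per-device record of the bus settings and only reprograms the SPI peripheral when the next device actually needs different ones. The sketch below is hypothetical throughout: spi_dev_cfg_t, spi_select_device() and spi_hw_apply() are invented names, and spi_hw_apply() is a stand-in for whatever the real driver needs in order to be reconfigured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Everything that can differ between devices sharing one SPI bus. */
typedef struct {
    uint8_t  cs_pin;     /* chip-select line for this device */
    uint8_t  cpol;       /* clock polarity: 0 or 1 */
    uint8_t  cpha;       /* clock phase: 0 or 1 */
    bool     msb_first;  /* bit order */
    uint32_t clock_hz;   /* bus speed */
} spi_dev_cfg_t;

static const spi_dev_cfg_t *current_cfg;     /* device last selected */
static unsigned             reconfig_count;  /* times the bus was reprogrammed */

/* Chip select is per device, so it is deliberately not compared here. */
static bool same_bus_settings(const spi_dev_cfg_t *a, const spi_dev_cfg_t *b)
{
    return a->cpol == b->cpol && a->cpha == b->cpha &&
           a->msb_first == b->msb_first && a->clock_hz == b->clock_hz;
}

/* Placeholder: a real version would reprogram the SPI peripheral here. */
static void spi_hw_apply(const spi_dev_cfg_t *cfg)
{
    (void)cfg;
    reconfig_count++;
}

/* Make cfg the active device, reprogramming the bus only if it must change. */
void spi_select_device(const spi_dev_cfg_t *cfg)
{
    assert(cfg != NULL);
    if (current_cfg == NULL || !same_bus_settings(current_cfg, cfg)) {
        spi_hw_apply(cfg);
    }
    current_cfg = cfg;
}

unsigned spi_reconfig_count(void)
{
    return reconfig_count;
}
```

If every device really did share the same settings, spi_hw_apply() would run once and never again, which is the "kludge it" case; as soon as one device differs, the cost of switching shows up only on the transitions between unlike devices.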

So the question is, should I just bite the bullet at this stage, while I have my head in the code, do a decent engineering job, and make it handle multiple devices? Or should I just forget other devices (like the uSD card), move on to get the vital stuff done, and then come back? I’m afraid that if I just leave it, I’ll never have the stomach to come back and fix it.

What to do?

"They're techno trousers, ex-NASA, fantastic for walkies!"
