2023 03 25 Raven Status Update

A fair amount of progress has been made on Raven over the last few days.

I have a list of TODO items that need to be tackled. One of those items has been on the list for months, and this week it bubbled to the top. My proximity sensor readings were being timestamped with stale times.

In ROS (Robot Operating System), the whole software stack that tries to generate commands to move the robot somewhere interesting relies on a flood of data coming in, like SONAR readings, time of flight readings and LIDAR readings. Each of those readings comes with a timestamp indicating when the reading was taken.

On my robot, the readings are made on a custom PC board I made, and then sent to the main computer of the robot. The custom board needs to have the same sense of “now”-ness as the main PC — the two computers have to agree within a few milliseconds on what time it is now. It doesn’t do any good to take a SONAR reading and then have it show up a half second late to the main computer. The algorithms all have a small “tolerance” for time skew, and if sensor readings are too old, they are rejected out of hand as useless.
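To make the "tolerance" idea concrete, here is a minimal sketch of the kind of freshness check a consumer of sensor data might apply. The names (Reading, is_fresh) and the 0.1 second tolerance are made up for illustration; each real ROS component has its own setting and its own way of checking.

```python
from dataclasses import dataclass

# Hypothetical tolerance for illustration only.
TOLERANCE_S = 0.1


@dataclass
class Reading:
    stamp_s: float  # when the sensor board says the reading was taken
    range_m: float  # measured distance


def is_fresh(reading: Reading, now_s: float, tolerance_s: float = TOLERANCE_S) -> bool:
    """Return True if the reading's timestamp is within tolerance of 'now'."""
    # abs() also catches stamps from the "future", which happen when the
    # sensor board's clock runs ahead of the main computer's clock.
    return abs(now_s - reading.stamp_s) <= tolerance_s
```

A reading stamped 20 milliseconds ago passes this check, while one stamped half a second ago is thrown away.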

The software piece I use to transmit my sensor values from the custom board to the main computer is a package called “Micro ROS”. I use this software with some small customization, as my custom board has a lot more capability than the usual board Micro ROS is intended for.

One function built into Micro ROS is the ability to ask another computer what time it is, and set its own clock to match that other computer. But setting it just once doesn’t work. Each computer has a clock driven by a crystal-controlled oscillator, and crystals drift as they heat up. Worse, much worse, the CPU clock chip in my custom board seems to “burp” now and then when interrupt signals come in, and my hardware generates a fair number of interrupt signals.

Another problem is that Micro ROS has baked into it an expectation that the main computer is using a particular version of communication software, and the expected version currently has bugs which cause my stereo cameras to not operate correctly. It took a bit of reading for me to realize that little factoid.

For the moment, I can ignore that, so I set my main computer back to using the buggy communication software. Also, when Micro ROS asks for the current time, it takes “quite a bit of time” to get the result, usually about 150 milliseconds, but sometimes as much as a second. So any answer it gets from the main PC will be inherently wrong right away.
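One common way to cope with a slow answer is to measure the round trip and assume the reply spent half of it in transit (the idea behind Cristian's algorithm). The sketch below shows that approach; it is not a description of what Micro ROS does internally or of my final code. The function name, the two callables and the 0.3 second cutoff are all hypothetical.

```python
from typing import Callable, Optional


def estimate_offset(
    local_clock: Callable[[], float],
    ask_host_time: Callable[[], float],
    max_round_trip_s: float = 0.3,
) -> Optional[float]:
    """Estimate (host time - local time) in seconds.

    Returns None when the exchange took too long to be trusted.
    """
    sent_s = local_clock()
    host_s = ask_host_time()
    received_s = local_clock()

    round_trip_s = received_s - sent_s
    if round_trip_s > max_round_trip_s:
        # A one-second round trip says very little about "now"; skip it.
        return None

    # Assumption: the request and the reply took equally long, so the
    # host's answer is about half a round trip old when it arrives.
    host_now_s = host_s + round_trip_s / 2.0
    return host_now_s - received_s
```

Adding the returned offset to the local clock gives an estimate of the host's time. Because the clocks drift, the estimate has to be refreshed regularly rather than computed once.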

My last few days of programming have been devoted to finding some way to make that all work with an allowable tolerance for errors in timing. I tried over and over, and I’m normally very good at finding creative solutions to tricky problems. Still, my code got progressively worse the more I tried to fix the problem. And then, my robot buddy Ralph Hipps called for one of our at-least-daily robot chats and in the process of explaining the problem to him, it occurred to me what the root cause was. My custom board was attempting to do a complex operation during interrupt handling.

Interrupt handlers on a computer must be very quick. If your interrupt handler code takes more than about a millisecond, sometimes even only a few tens of microseconds, everything falls apart. And because I was explaining the symptoms to someone else, I finally realized that I was taking probably tens of milliseconds in my interrupt handler for the SONAR sensors.

Once I realized that, the fix was easy. The details aren’t too important, but I stashed the important information from the signal in a shared, global location and exited the interrupt handler. Then the normal, non-interrupt code picked up the stashed information for processing. Outside of the interrupt handler, you can largely take whatever time you want to process data. Sort of. ROS navigation depends heavily on high “frame rates” from sensors. If you can’t generate, say, LIDAR data 10 times a second or faster, you have to slow your robot way down, otherwise the robot would move faster than it could react to, say, the cat rearing up in front of your robot with nunchucks in hand.
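The stash-and-exit pattern looks roughly like the sketch below. This is Python standing in for firmware code, not my actual handler: the function names are hypothetical, and I'm assuming for illustration that the raw value is an echo time in microseconds. On a real microcontroller this would be C, the shared variable would be declared volatile, and the main loop would briefly disable interrupts while it reads and clears the stash.

```python
from typing import Optional

SPEED_OF_SOUND_M_PER_S = 343.0  # approximate, in room-temperature air

# Shared, global stash: written by the handler, read by the main loop.
_pending_echo_us: Optional[int] = None


def on_echo_interrupt(echo_duration_us: int) -> None:
    """Interrupt-side work: record the raw measurement and return at once."""
    global _pending_echo_us
    _pending_echo_us = echo_duration_us


def poll_sonar() -> Optional[float]:
    """Main-loop work: take the stashed value, if any, and do the slow part."""
    global _pending_echo_us
    echo_us = _pending_echo_us
    if echo_us is None:
        return None  # nothing new since the last poll
    _pending_echo_us = None
    # The sound travels out and back, so halve the round-trip time.
    return (echo_us / 1_000_000.0) * SPEED_OF_SOUND_M_PER_S / 2.0
```

The handler does one assignment and leaves. The unit conversion, and anything else slow such as building and publishing a message, happens in poll_sonar(), which the main loop calls whenever it gets around to it.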

The robot is now happily sending sensor data nearly three times faster than before, about 80 frames a second, and rarely gets the time out of sync by very much. Now I can move on to the new problems that have shown up because I fixed this problem.
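To put frame rates in perspective, the distance the robot covers between two consecutive sensor frames is just its speed divided by the frame rate. The half meter per second used in the example is an arbitrary speed chosen for illustration, not Raven's actual speed.

```python
def travel_between_frames_m(speed_m_per_s: float, frames_per_s: float) -> float:
    """Distance the robot moves between two consecutive sensor frames."""
    return speed_m_per_s / frames_per_s


# At a hypothetical 0.5 m/s: 10 frames a second means 5 cm of blind travel
# per frame, while 80 frames a second means about 6 mm.
blind_at_10_hz_m = travel_between_frames_m(0.5, 10.0)
blind_at_80_hz_m = travel_between_frames_m(0.5, 80.0)
```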

Below is an obligatory snapshot of a part of the visualization software showing my robot, in a room, with LIDAR, SONAR and time of flight sensors showing as well. No cat is in this picture.

Everything about robots is hard (TM)