2023 03 25 Raven Status Update

A fair amount of progress has been made on Raven over the last few days.

I have a list of TODO items that need to be tackled. One of those items has been on the list for months, and this week it bubbled to the top: my proximity sensor values were being stamped with stale timestamps.

In ROS (Robot Operating System), the whole software stack that tries to generate commands to move the robot somewhere interesting relies on a flood of incoming data: SONAR readings, time-of-flight readings, and LIDAR readings. Each of those readings comes with a timestamp indicating when the reading was taken.

On my robot, the readings are made on a custom PC board I made, and then sent to the main computer of the robot. The custom board needs to have the same sense of “now”-ness as the main PC — the two computers have to agree within a few milliseconds on what time it is now. It doesn’t do any good to take a SONAR reading and then have it show up a half second late to the main computer. The algorithms all have a small “tolerance” for time skew, and if sensor readings are too old, they are rejected out of hand as useless.
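To make that concrete, here is a minimal sketch of such a staleness check in Python with rclpy. This is illustrative only, not code from the actual navigation stack; the topic name matches my time-of-flight topics, and the 0.1 second tolerance is just an example value.

import rclpy
from rclpy.node import Node
from rclpy.qos import qos_profile_sensor_data
from rclpy.time import Time
from sensor_msgs.msg import Range

class StalenessChecker(Node):
    def __init__(self):
        super().__init__('staleness_checker')
        self.tolerance_sec = 0.1  # Example budget for clock skew plus transport delay.
        self.create_subscription(Range, '/tof0Sensor', self.on_range, qos_profile_sensor_data)

    def on_range(self, msg):
        # Compare the sensor's timestamp against "now" on this computer.
        age_sec = (self.get_clock().now() - Time.from_msg(msg.header.stamp)).nanoseconds / 1e9
        if age_sec > self.tolerance_sec:
            self.get_logger().warn('Rejecting stale reading (%.3f s old)' % age_sec)
            return
        # Otherwise the reading is fresh enough to feed to the algorithms.

def main():
    rclpy.init()
    rclpy.spin(StalenessChecker())

if __name__ == '__main__':
    main()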

The software piece I use to transmit my sensor values from the custom board to the main computer is a package called “Micro ROS”. I use it with some small customizations, as my custom board has considerably more capability than the boards Micro ROS usually targets.

One function built into Micro ROS is the ability to ask another computer what time it is, and set its own clock to match that other computer. But setting it just once doesn’t work. Each computer has a clock driven by a crystal-controlled oscillator, and crystals drift as they heat up. Worse, much worse, the CPU clock chip in my custom board seems to “burp” now and then when interrupt signals come in, and my hardware generates a fair number of interrupt signals.

Another problem is that Micro ROS has baked into it an expectation that the main computer is using a particular version of communication software, and the expected version currently has bugs which cause my stereo cameras to not operate correctly. It took a bit of reading for me to realize that little factoid.

For the moment, I can live with that, so I set my main computer back to using the buggy communication software. Also, when Micro ROS asks for the current time, it takes “quite a bit of time” to get the result, usually about 150 milliseconds, but sometimes as much as a second. So any answer it gets from the main PC will be inherently wrong right away.
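The usual way to get a usable answer out of a slow exchange is the classic round-trip estimate: record when the request left and when the reply arrived, and assume the delay was split evenly between the two directions. I’m not claiming this is literally what Micro ROS does internally; it’s just a sketch of the idea, with made-up numbers:

def clock_offset(t_request_sent, t_server, t_reply_received):
    # Estimate how far the local clock is from the server's clock using one
    # request/reply exchange. Assumes the outbound and return legs took about
    # the same time; when they don't, the estimate can be off by up to half
    # the round-trip time.
    round_trip = t_reply_received - t_request_sent
    midpoint = t_request_sent + round_trip / 2.0
    return t_server - midpoint

# With a 150 millisecond round trip, the symmetry assumption alone can hide
# up to about 75 milliseconds of error, before any crystal drift is counted.
print(clock_offset(t_request_sent=10.000, t_server=12.080, t_reply_received=10.150))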

My last few days of programming have been devoted to finding some way to make that all work with an allowable tolerance for errors in timing. I tried over and over, and I’m normally very good at finding creative solutions to tricky problems. Still, my code got progressively worse the more I tried to fix the problem. And then, my robot buddy Ralph Hipps called for one of our at-least-daily robot chats, and in the process of explaining the problem to him, it occurred to me what the root cause was. My custom board was attempting to do a complex operation during interrupt handling.

Interrupt handlers on a computer must be very quick. If your interrupt handler code takes more than about a millisecond, sometimes even only a few tens of microseconds, everything falls apart. And because I was explaining the symptoms to someone else, I finally realized that I was taking probably tens of milliseconds in my interrupt handler for the SONAR sensors.

Once I realized that, the fix was easy. The details aren’t too important, but I stashed the important information from the signal in a shared, global location and exited the interrupt handler. Then the normal, non-interrupt code picked up the stashed information for processing. Outside of the interrupt handler, you can largely take whatever time you want to process data. Sort of. ROS navigation depends heavily on high “frame rates” from sensors. If you can’t generate, say, LIDAR data 10 times a second or faster, you have to slow your robot way down; otherwise the robot would move faster than it could react to, say, the cat rearing up in front of it with nunchucks in hand.
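The real firmware is C++ on the Teensy, but the shape of the fix translates into a few lines of illustrative Python (every name here is invented for the example): the handler stashes the raw value and returns immediately, and the slow math happens later, in the main loop.

from collections import deque

# Ring buffer shared between the interrupt context and the main loop. In the
# real firmware this is a small fixed-size array; deque(maxlen=...) stands in.
pending_echoes = deque(maxlen=32)

def on_sonar_echo(echo_micros):
    # The "interrupt handler": do the bare minimum and get out.
    pending_echoes.append(echo_micros)

def publish_range(distance_m):
    # Stand-in for building and publishing the actual sensor message.
    print('range: %.3f m' % distance_m)

def process_pending():
    # The main loop: all the slow work happens here, outside the handler.
    while pending_echoes:
        echo_micros = pending_echoes.popleft()
        # Convert round-trip echo time to one-way distance at the speed of sound.
        distance_m = (echo_micros / 1e6) * 343.0 / 2.0
        publish_range(distance_m)

on_sonar_echo(5831)  # A 5831 microsecond echo...
process_pending()    # ...prints "range: 1.000 m".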

The robot is now happily sending sensor data nearly three times faster than before, about 80 frames a second, and the timestamps rarely get out of sync by very much. Now I can move on to the new problems that have shown up because I fixed this one.

Below is an obligatory snapshot of part of the visualization software showing my robot, in a room, with LIDAR, SONAR, and time-of-flight sensor readings showing as well. No cat is in this picture.

Everything about robots is hard (TM)

2023 02 25 Raven Status Update

I’ve spent the last week adding two smart cameras to my robot, which currently provide six regular camera images and produce two point clouds (the grey structures in the image below). When I first got them working and turned on the depth computation, I hadn’t charged the battery on the robot in some time, and the additional power drawn by the depth computation killed the battery and shut the whole robot down.

The pair of cameras draw an additional 60 watts of power right now, and I haven’t even turned on the artificial intelligence computation yet. Well, I don’t think I have. The cameras provide a lot of capability, and I’ve had to rewrite the manufacturer’s sample code to be able to run two cameras at once on the same robot, so I’m not sure yet which features I’ve enabled besides depth calculation and stereo vision.

The text at the very bottom of the picture shows that when I run the base robot code and enable the visualizer (rviz2) software, about 3/4 of the computer’s capacity is consumed. That’s enough to make it hard to do anything else on the computer, like composing this message while the robot software is running.

Progress is good.

Raven is working under simulation, at least in these early days

I have made several attempts over the last few years to get my robot to work under simulation in Gazebo, and I finally have Raven working. Well, it works under Gazebo; I still need to work out how to make it work in real life again, but that should be easier. For now, I had to remove the T265 in simulation, as it sat in the frame tree as the provider of the odom topic and transformation, which is now done by Gazebo. I will probably need to make separate URDFs for simulation versus non-simulation.

Thanks go out to Dillo for providing Gazebo simulations for the OAK-D cameras I use. I started with his work at:

tenacity_rover/tenacity_description/urdf at 0945115d723c82f1d72bb4542ab3540b5a3fad10 · jetdillo/tenacity_rover (github.com)

I made changes for it to work under Humble and with more than one camera on a robot. I can offer the changes, which are not very big, back to Steve if he wants to consider them.

Thanks also to the folks at Articulated Robotics, who have produced some of the best videos I’ve seen on using ROS 2, and I’ve seen an awful lot of videos on ROS 2. This video helped me finally get the small details right for simulation:

Simulating Robots with Gazebo and ROS | Getting Ready to Build Robots with ROS #8 (youtu.be)

And here is a picture of my robot in a small world, shown in RViz2 and in Gazebo:


udev rules for multiple Teensy 4.1 devices

My latest robot, Raven, now has two custom PC boards in it, each containing a Teensy 4.1 processor. One board handles all the proximity sensors and does a few other chores; the other manages the touch screen, senses a few other values such as the battery voltage, and performs some other control functions.

As with any USB device plugged into Linux, the name that gets assigned to the device under the /dev directory can change every time you power up the Linux computer or merely unplug and replug the USB device. Fortunately, the manufacturer of the Teensy 4.1 supplies a unique serial number with each device, and I use that attribute value in custom udev rules that create stable symbolic names under the /dev directory. One name, /dev/teensy_sensor, will always point to the correct /dev/ttyACMX device (where X is some digit), no matter how Linux renames it when it discovers the device, and /dev/teensy_screen will always point to the other device. Any script or code of mine that needs a device name for communicating with a Teensy 4.1 can use the appropriate name created via a custom udev rule.

The default udev rule file supplied by the manufacturer of the Teensy 4.1 does not provide any static /dev name if you plug in multiple devices. The default USB device names of, e.g., /dev/ttyACM0, /dev/ttyACM1, etc. will point to different Teensy devices every time you power up, or unplug and replug, a Teensy device. What you want is a symlink providing a name of your choosing that reliably points to the correct Teensy device.

If you use, e.g., the lsusb command, you might see something like the following, where lines that don’t refer to the Teensy devices are replaced with ellipses:

...
Bus 001 Device 108: ID 16c0:0483 Van Ooijen Technische Informatica Teensyduino Serial
...
Bus 001 Device 111: ID 16c0:0483 Van Ooijen Technische Informatica Teensyduino Serial
...

You could then look at more details of a specific device with a command such as “lsusb -v -s 001:108”, where “-v” asks for verbose output and the pair of colon-separated numbers after “-s” specify the bus and device of the specific USB device you want to inspect. The beginning of the resulting output might look something like:

Bus 001 Device 108: ID 16c0:0483 Van Ooijen Technische Informatica Teensyduino Serial
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          239 Miscellaneous Device
  bDeviceSubClass         2 
  bDeviceProtocol         1 Interface Association
  bMaxPacketSize0        64
  idVendor           0x16c0 Van Ooijen Technische Informatica
  idProduct          0x0483 Teensyduino Serial
  bcdDevice            2.80
  iManufacturer           1 Teensyduino
  iProduct                2 USB Serial
  iSerial                 3 8771250
...

The trailing lines are again replaced with an ellipsis.

To create a udev rule specific to one Teensy 4.1 device, you need to add a specifier to the rule that matches on the iSerial value. In the default udev file provided by the manufacturer, namely /etc/udev/rules.d/00-teensy.rules, you will see a line like:

ATTRS{idVendor}=="16c0", ATTRS{idProduct}=="04[789a]*", ENV{MTP_NO_PROBE}="1"

Replace that line with one line per device, each adding a qualifier on the serial number and a SYMLINK action. Also add the KERNEL qualifier at the beginning of the rule; otherwise, when you reprogram the Teensy, the symlink will end up pointing at a different class of device. For example:

KERNEL=="ttyACM*", ATTRS{idVendor}=="16c0", ATTRS{idProduct}=="04[789a]*", ATTRS{serial}=="8771250", ENV{MTP_NO_PROBE}="1",  SYMLINK+="teensy_sensor"
KERNEL=="ttyACM*", ATTRS{idVendor}=="16c0", ATTRS{idProduct}=="04[789a]*", ATTRS{serial}=="12135850", ENV{MTP_NO_PROBE}="1",  SYMLINK+="teensy_screen"

With the above, my Teensy 4.1 device with an iSerial value of 8771250 will have a symlink named /dev/teensy_sensor that points to the correct /dev/ttyACMX device (where X is a digit), no matter which X value Linux assigns to the device when it is plugged in. Likewise, my other Teensy 4.1 device, with a different serial number, will always get a symlink named /dev/teensy_screen that I can use as a device name in any code or script that would normally want a device name like, e.g., /dev/ttyACM0.

Note that even though the attribute is named iSerial when shown by lsusb, the qualifier name in the udev rule must use serial as the name.
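After editing the rules file, reload the rules with “udevadm control --reload-rules” followed by “udevadm trigger” (or simply unplug and replug the device) and the symlinks will appear. Any code can then open the stable name. For example, here is a minimal sketch using the pyserial package; the baud rate is illustrative:

import serial  # The pyserial package.

# Open the symlink created by the udev rule above. It follows the device no
# matter which /dev/ttyACMX name Linux handed out this time.
port = serial.Serial('/dev/teensy_sensor', baudrate=115200, timeout=1.0)
print(port.readline())
port.close()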

Visualizing the noise in proximity sensors

The time-of-flight proximity sensors on my Raven (formerly Puck) robot use VL53L0X chips and are fairly noisy. I wanted a quick way to visualize just how noisy they are. Here is the graph I produced:

Histograms for 8 time-of-flight sensors

The two rows represent the front versus back of the robot frame. The four columns represent the two pairs of sensors for the left and right sides of the robot frame. At each corner of the robot frame, there is a time-of-flight sensor pointed outward to the side (left or right) and another pointed either forward or backward. Each histogram shows normalized counts per distance bin over the last sampling period.

As you can see from the histograms, some of the sensors are fairly noisy. This is with the robot not moving and nothing nearby moving. Readings at 0 mm should be ignored: I moved all readings beyond 2 meters into the 0-meter bucket, as they are uninteresting. The histogram in the third column of the first row shows a fairly wide variance in readings, while the histogram in the last column of the last row shows a pretty narrow variance.

Below is the Python code used to generate the plots. It reads the ROS 2 messages generated by my custom monitor computer and uses matplotlib to do the display.

from datetime import datetime
import rclpy
from rclpy.node import Node
from rclpy.qos import QoSProfile, QoSReliabilityPolicy, QoSHistoryPolicy

import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
from sensor_msgs.msg import Range

class MinimalSubscriber(Node):

    def __init__(self):
        super().__init__('minimal_subscriber')
        self.start_time = datetime.now()
        self.callback_count = 0

        # These are the title strings shown per sensor histogram.
        # They are a shortcut, for me, as to the physical position of the corresponding sensor.
        # The 'X" shows where the sensor is located. 
        # The '<' and '>' show whether the sensor is on the left or right side of the robot frame.
        # The '^' and 'v' show whether the sensor is on the front or rear of the robot frame.
        self.sensor_names = [['X<^-', '-<^X', 'X^>-', '-^>X'],
                             ['X<v-', '-<vX', 'Xv>-', '-v>X']]
        self.number_sensors = 8 # There are 8 sensors.
        self.number_values_to_cache = 20 # I want to show the variance over this number of the last readings.
        self.last_n_values_per_sensor = np.zeros(
            (self.number_sensors, self.number_values_to_cache), dtype='float')
        self.next_index_number = np.zeros((self.number_sensors), dtype='int32')

        # Create an array of histograms.
        # Two rows for front vs back of robot.
        # Four columns for left-sideways, left-front-or-back, right-front-or-back, right-sideways position.
        self.figure, self.axis = plt.subplots(
            nrows=2, ncols=4, sharex=False, sharey=False, squeeze=False, figsize=(8, 2))
        
        # Set the window title.
        self.figure.canvas.manager.set_window_title('Time Of Flight Sensors step')

        # Create the histogram bin edges: 0.00 to 1.00 meters in 1 cm steps.
        # I'm interested in only ranges from 0.00 to 1.00 meters.
        self.bins = [x / 100.0 for x in range(101)]

        # Make it all look pretty.
        plt.subplots_adjust(hspace=0.6)
        plt.autoscale(enable=True, axis='both', tight=True)
        plt.rcParams['lines.linewidth'] = 1

        # Set up the ROS 2 quality of service in order to read the sensor data.
        qos_profile = QoSProfile(
            reliability=QoSReliabilityPolicy.BEST_EFFORT,
            history=QoSHistoryPolicy.KEEP_LAST,
            depth=1
        )
        
        # Subscribe to the sensor topics, keeping a reference to each
        # subscription so it is not garbage collected.
        self.sensor_subscriptions = []
        for sensor_number in range(8):
            self.sensor_subscriptions.append(self.create_subscription(
                Range,
                '/tof{s}Sensor'.format(s=sensor_number),
                self.listener_callback,
                qos_profile,
            ))

        # Set up the 8 histogram formats and titles.
        self.patches = [None] * 8
        for row in range(2):
            for col in range(4):
                sensor_number = (row * 4) + col
                n, bins, patches = self.axis[row][col].hist(
                    self.last_n_values_per_sensor[sensor_number], self.bins, histtype='bar')
                self.patches[sensor_number] = patches
                self.axis[row, col].set_title(
                    self.sensor_names[row][col], fontsize=8, fontfamily='monospace')
        
        # Let's go.
        plt.ion()
        plt.show()
        


    # Process a time-of-flight sensor message of type Range.
    def listener_callback(self, msg):
        self.callback_count = self.callback_count + 1
        sensor_number = int(msg.header.frame_id[-1])    # Sensor number is the last character of the frame id.
        range_value = msg.range
        if (range_value > 2.0):
            # If the range is greater than 2 meters, ignore it by setting it to zero.
            range_value = 0
            
        # Capture the last readings of the sensor in a ring buffer.
        self.last_n_values_per_sensor[sensor_number][self.next_index_number[sensor_number]] = range_value

        if (self.callback_count % 24) == 0:
            # Periodically update the plots.
            for s in range(8):
                # For each sensor, recompute the histogram from its ring buffer.
                data = self.last_n_values_per_sensor[s]
                n, _ = np.histogram(data, self.bins, density=True)
                peak = n.max()
                if peak == 0:
                    continue  # No readings for this sensor yet; leave its plot alone.
                for count, rect in zip(n, self.patches[s]):
                    rect.set_height(count / peak)  # Normalize the bar heights to the tallest bin.
            self.figure.canvas.draw()
            self.figure.canvas.flush_events()
            
            # Print out the frames per second of sensor data for all 8 sensors since the last plot update.
            # Divide by 8 if you want to know the frames per second per sensor.
            duration = (datetime.now() - self.start_time).total_seconds()
            fps = self.callback_count / duration
            print("callback_count: %d, duration: %f, fps: %3.2f" % (self.callback_count, duration, fps))

        # Update the ring buffer index.
        self.next_index_number[sensor_number] = self.next_index_number[sensor_number] + 1
        if self.next_index_number[sensor_number] >= self.number_values_to_cache:
            self.next_index_number[sensor_number] = 0

def main(args=None):
    rclpy.init(args=args)

    minimal_subscriber = MinimalSubscriber()

    rclpy.spin(minimal_subscriber)

    minimal_subscriber.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()