Log 002: UnitreeGo1 Telemetry Drop
During active preference learning (APreL) tests on the UnitreeGo1 quadruped, the telemetry loop broke after approximately 20 minutes of sustained load. The ROS2 bridge saturates when it publishes proprioceptive joint states at 100Hz or above concurrently with the visual data streams.
The failure mode was subtle: the robot continued to move, but the feedback loop to the preference learning module silently stalled. The human demonstrator kept providing corrections that were never registered, which made the entire session useless for data collection.
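Because the stall was silent, a staleness check on the feedback stream would have surfaced it early. Below is a minimal sketch, not what ran on the robot: `FeedbackWatchdog`, its 2-second timeout, and the injectable clock are all hypothetical names and values chosen for illustration.

```python
import time


class FeedbackWatchdog:
    """Flags a stalled feedback stream when no message arrives within a timeout.

    Hypothetical helper: the class name and the default timeout are
    assumptions, not part of the original test setup.
    """

    def __init__(self, timeout_s=2.0, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed threshold, tune per setup
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = clock()

    def mark_received(self):
        """Call from the callback that registers each human correction."""
        self._last_seen = self._clock()

    def is_stalled(self):
        """True when the stream has been silent longer than the timeout."""
        return (self._clock() - self._last_seen) > self.timeout_s
```

Calling `mark_received()` whenever the preference module registers a correction and polling `is_stalled()` from a timer would let the session stop, or at least warn the demonstrator, instead of collecting nothing for the rest of the run.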
What Happened
ROS2's DDS layer was dropping messages under the combined bandwidth pressure of joint states (100Hz), compressed image topics (15fps), and the VLM inference telemetry. The bridge's in-memory queue backed up completely after ~20 minutes of sustained GPU load.
Lesson Learned
Downsample joint state publishing during high-compute VLM tasks. The preference model doesn't need proprioceptive feedback at 100Hz, because it's predicting high-level task intent, not low-level balance corrections. Dropping to 20Hz during inference cuts the joint state message rate, and with it the joint state bandwidth, by 80% with negligible impact on learning quality.
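Going from 100Hz to 20Hz means forwarding one joint state sample in every five. Here is a minimal sketch of that decimation and the resulting reduction. `JointStateDecimator` and `reduction_fraction` are hypothetical names, and the sketch deliberately leaves out the ROS2 publisher itself.

```python
class JointStateDecimator:
    """Keeps one sample out of every N so a source stream is thinned to a target rate.

    Hypothetical helper; assumes the source rate is an integer multiple
    of the target rate (100Hz -> 20Hz keeps every 5th sample).
    """

    def __init__(self, source_hz=100, target_hz=20):
        if target_hz <= 0 or source_hz % target_hz != 0:
            raise ValueError("source_hz must be a positive integer multiple of target_hz")
        self.keep_every = source_hz // target_hz
        self._count = 0

    def should_publish(self):
        """Return True for the samples that should be forwarded."""
        publish = (self._count % self.keep_every) == 0
        self._count += 1
        return publish


def reduction_fraction(source_hz, target_hz):
    """Fraction of messages removed by downsampling (0.8 for 100Hz -> 20Hz)."""
    return 1.0 - target_hz / source_hz
```

In a joint state callback, the message would be republished only when `should_publish()` returns True. The 80% figure applies to the joint state topic alone; the image and VLM telemetry traffic is unaffected.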
Will implement a dynamic QoS policy that automatically reduces joint state frequency when GPU utilization exceeds 85%.
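One way the rate decision could look is sketched below. Only the 85% trigger comes from this log. The 75% release threshold (hysteresis, so the rate doesn't flap around the trigger), the class name `JointStateRatePolicy`, and the assumption that a GPU utilization percentage is passed in from elsewhere are all illustrative. How the chosen rate gets applied to the ROS2 publisher is left out.

```python
class JointStateRatePolicy:
    """Chooses a joint state publish rate from GPU utilization, with hysteresis.

    Hypothetical sketch: high_pct=85 is from the log; low_pct=75 is an
    assumed release threshold added to avoid rapid toggling.
    """

    def __init__(self, high_pct=85.0, low_pct=75.0, full_hz=100, reduced_hz=20):
        self.high_pct = high_pct
        self.low_pct = low_pct
        self.full_hz = full_hz
        self.reduced_hz = reduced_hz
        self._throttled = False

    def update(self, gpu_util_pct):
        """Feed the latest GPU utilization (0-100) and get the rate to use in Hz."""
        if gpu_util_pct > self.high_pct:
            self._throttled = True       # enter reduced-rate mode
        elif gpu_util_pct < self.low_pct:
            self._throttled = False      # load has dropped, restore full rate
        # Between the two thresholds the previous state is kept.
        return self.reduced_hz if self._throttled else self.full_hz
```

With these defaults, a utilization trace of 50, 90, 80, 70 percent yields 100Hz, 20Hz, 20Hz, 100Hz: the reading of 80 stays throttled because it has not yet fallen below the release threshold.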