Research Logs | Gyanig Kumar

Feb 28, 2026 · 3 min read

Log 003: Gaze Desynchronization

Gaze Failures

Noticed significant desynchronization when piping gaze vectors through the VLM framework during rapid head movements. The 7-DoF manipulator started lagging by ~400 ms; the root cause is frame dropping in the explicit cue alignment module.

Next step: Re-implement the buffer as a ring structure rather than a standard queue. We lose a bit of fidelity but gain real-time responsiveness.
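
A rough sketch of what the ring structure could look like, assuming gaze samples arrive as timestamped 3D direction vectors. The names here (GazeSample, GazeRingBuffer, capacity, dropped) are placeholders for illustration, not the actual module API. Once the buffer is full, each new sample overwrites the oldest one, which is where the fidelity loss comes from, and the consumer always reads the freshest gaze vector instead of working through a backlog.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple


@dataclass(frozen=True)
class GazeSample:
    """Hypothetical sample type: a timestamp plus a 3D gaze direction."""
    timestamp: float
    direction: Tuple[float, float, float]


class GazeRingBuffer:
    """Fixed-capacity buffer that overwrites the oldest sample when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # deque with maxlen discards from the opposite end on append.
        self._samples: Deque[GazeSample] = deque(maxlen=capacity)
        self.dropped = 0  # number of samples overwritten so far

    def push(self, sample: GazeSample) -> None:
        if len(self._samples) == self._samples.maxlen:
            self.dropped += 1  # the oldest sample is about to be evicted
        self._samples.append(sample)

    def latest(self) -> Optional[GazeSample]:
        """Return the most recent sample, or None if the buffer is empty."""
        return self._samples[-1] if self._samples else None

    def snapshot(self) -> List[GazeSample]:
        """Return the buffered samples, oldest first."""
        return list(self._samples)

    def __len__(self) -> int:
        return len(self._samples)
```

Tracking the dropped count is a design choice rather than a requirement. It makes the fidelity cost measurable when tuning the capacity.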

Log 002: UnitreeGo1 Telemetry Drop

Proprio APreL

During active preference learning tests on the quadruped, the telemetry loop broke after 20 minutes of sustained load. The ROS 2 bridge gets saturated when publishing proprioceptive joint states at more than 100 Hz alongside visual data.

Lesson learned: Downsample joint state publishing during high-compute VLM tasks. The model doesn't need 100 Hz to derive human intent anyway.
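
One way the downsampling could be sketched, independent of any ROS 2 API: a small time-based gate that decides whether a joint state message should be forwarded. The class name JointStateThrottle and the 20 Hz target in the usage lines are assumptions for illustration; the log does not fix a target rate. Timestamps are integer nanoseconds to avoid floating-point drift when comparing intervals.

```python
from typing import Optional


class JointStateThrottle:
    """Lets a message through only if enough time has passed since the last one."""

    def __init__(self, max_rate_hz: float) -> None:
        if max_rate_hz <= 0:
            raise ValueError("max_rate_hz must be positive")
        # Minimum spacing between forwarded messages, in nanoseconds.
        self._min_interval_ns = round(1_000_000_000 / max_rate_hz)
        self._last_sent_ns: Optional[int] = None

    def should_publish(self, stamp_ns: int) -> bool:
        """Return True if the message stamped stamp_ns should be forwarded."""
        if (
            self._last_sent_ns is None
            or stamp_ns - self._last_sent_ns >= self._min_interval_ns
        ):
            self._last_sent_ns = stamp_ns
            return True
        return False


# Usage: gate a 100 Hz stream (one sample every 10 ms) down to an assumed 20 Hz.
throttle = JointStateThrottle(max_rate_hz=20.0)
forwarded = [i for i in range(100) if throttle.should_publish(i * 10_000_000)]
```

In a real node the gate would sit in the joint state callback, in front of the publisher, so the bridge only sees the reduced stream.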

Log 001: Initializing the New Architecture

Architecture

Moving away from standalone model scripts to a unified, modular framework. The goal is to swap the underlying foundation model without rewriting the entire intent-recognition pipeline.
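
A minimal sketch of the swap idea, assuming the pipeline only ever talks to the model through one narrow interface. Everything named here (IntentBackend, IntentPipeline, infer_intent, and the two stub backends) is hypothetical and stands in for whatever the real framework ends up exposing.

```python
from typing import Protocol, Sequence


class IntentBackend(Protocol):
    """Hypothetical interface every model backend would implement."""

    def infer_intent(self, observation: Sequence[float]) -> str:
        ...


class IntentPipeline:
    """Pipeline code depends on the interface, not on a concrete model."""

    def __init__(self, backend: IntentBackend) -> None:
        self._backend = backend

    def swap_backend(self, backend: IntentBackend) -> None:
        # Replacing the model does not touch the rest of the pipeline.
        self._backend = backend

    def run(self, observation: Sequence[float]) -> str:
        return self._backend.infer_intent(observation)


class ThresholdStub:
    """Stand-in backend: reports 'reach' when the summed signal is positive."""

    def infer_intent(self, observation: Sequence[float]) -> str:
        return "reach" if sum(observation) > 0 else "idle"


class AlwaysIdleStub:
    """Second stand-in backend, used only to show the swap."""

    def infer_intent(self, observation: Sequence[float]) -> str:
        return "idle"
```

With this shape, calling swap_backend with a different backend changes the model while run and everything downstream stay untouched.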
