Open Source: Dobot + RealSense Controller from MWC 2026
We are releasing the full source of the robot-vision demo we built for MWC Barcelona 2026: a web-based controller for the Dobot Magician arm paired with an Intel RealSense depth camera.
The live pick-and-place demo at our booth drew a lot of questions about how it was wired up. Today we are making the whole thing public.
Repository: github.com/rosepetal-ai/MWC-dobot-realsense-Controller
What It Is
A single-container web controller that pairs a Dobot Magician robot arm with an Intel RealSense depth camera mounted above the workspace, looking straight down. You open a browser, see the live RGB and depth streams, and drive the arm from there: jog, click a pixel to move, or launch an automated pick-and-place sequence.
| Hardware | Role |
|---|---|
| Dobot Magician | 4-axis robot arm with suction end-effector, on USB |
| Intel RealSense | Aligned RGB + depth at 640×480, 30 fps, top-down view of the workspace |
| Overhead camera stand | Articulated arm / tripod that holds the RealSense above the workspace pointing straight down |
What It Does
- Real-time jog control over WebSocket with a fast serial path (no 200 ms pydobot stalls).
- Click-to-move: click any pixel in the live top-down image; the controller reads the depth at that pixel and drives the arm there with a safe rise-first trajectory.
- Automated “clean mode”: queue multiple targets and a drop point, then run the full pick-and-place loop autonomously.
- Optional NATS integration for inter-service messaging (camera frames via shared memory, click/drop/suck over request-reply).
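
To make the click-to-move step concrete, here is a minimal sketch of the two pieces of geometry involved: turning a clicked pixel plus its depth into a 3D point, and building a rise-first path. It is an illustration, not the repository's code. It assumes a plain pinhole camera model with made-up intrinsics, and it skips the camera-to-robot calibration, so the waypoint function takes points already expressed in the robot's frame. The names `Intrinsics`, `deproject`, and `rise_first_waypoints` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera parameters (illustrative values, not from the repo)."""
    fx: float  # focal length along x, in pixels
    fy: float  # focal length along y, in pixels
    cx: float  # principal point x, in pixels
    cy: float  # principal point y, in pixels


def deproject(u, v, depth_m, k):
    """Convert pixel (u, v) with depth in metres into a camera-frame point."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


def rise_first_waypoints(current, target, safe_z):
    """Rise to safe_z, travel horizontally at that height, then descend."""
    cur_x, cur_y, _ = current
    tgt_x, tgt_y, tgt_z = target
    return [
        (cur_x, cur_y, safe_z),  # lift straight up from where the arm is
        (tgt_x, tgt_y, safe_z),  # move above the target at the safe height
        (tgt_x, tgt_y, tgt_z),   # lower onto the target
    ]


# Example with assumed intrinsics for a 640x480 stream.
k = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
point_cam = deproject(380, 240, 0.5, k)

# Example path in robot coordinates (units are whatever the arm uses).
path = rise_first_waypoints((200, 0, 10), (250, 50, -40), safe_z=80)
```

Lifting before any horizontal travel keeps the end-effector clear of objects on the table, which is the point of a rise-first trajectory.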
Plugged Into Our Own Stack
At the booth we did not run this controller standalone: we drove it from our own Rosepetal platform over the same NATS bus it exposes, so the arm became just another node in the platform.
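
As a rough picture of what a request-reply command on that bus could look like, the sketch below encodes a command as a subject plus a JSON payload and decodes a JSON reply. Everything specific here is an assumption: the subject names, the JSON field names, and the `ok` flag in the reply are hypothetical, and the actual message format is defined in the repository. The bytes it produces would be handed to whichever NATS client you use.

```python
import json

# Hypothetical subject names; check the repository for the real ones.
SUBJECTS = {"click": "dobot.click", "drop": "dobot.drop", "suck": "dobot.suck"}


def encode_request(kind, **fields):
    """Return (subject, payload bytes) for one request-reply command."""
    if kind not in SUBJECTS:
        raise ValueError(f"unknown command: {kind}")
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return SUBJECTS[kind], payload


def decode_reply(data):
    """Parse a JSON reply; return (accepted, full reply dict)."""
    reply = json.loads(data.decode("utf-8"))
    return bool(reply.get("ok", False)), reply


# Example: a click at the image centre, and a toggle of the suction cup.
click_subject, click_payload = encode_request("click", u=320, v=240)
suck_subject, suck_payload = encode_request("suck", on=True)
```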
On top of that we wired in Gemini Robotics-ER 1.5 as the reasoning layer. The model looked at the top-down RealSense stream, picked a target in pixel space, and handed it back to the controller. That was enough to go from scripted pick-and-place to something interactive: we even had the arm playing tic-tac-toe against visitors on the booth table, with Gemini choosing the next cell and the controller turning it into a physical move.
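
For the tic-tac-toe case, one way to turn a chosen cell into a pixel target for the controller is sketched below. It assumes the board appears axis-aligned in the top-down image and that its bounding box in pixels is known. Both are assumptions for illustration, as are the function name `cell_center_px` and the example box; the post does not say how the booth setup did this mapping.

```python
def cell_center_px(row, col, board_box, size=3):
    """Pixel centre of grid cell (row, col) inside an axis-aligned board.

    board_box is (left, top, right, bottom) in image pixels.
    """
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError("cell outside the board")
    left, top, right, bottom = board_box
    cell_w = (right - left) / size
    cell_h = (bottom - top) / size
    u = left + (col + 0.5) * cell_w
    v = top + (row + 0.5) * cell_h
    return round(u), round(v)


# Example: a 300x300 px board centred in a 640x480 image (assumed layout).
board_box = (170, 90, 470, 390)
centre_cell = cell_center_px(1, 1, board_box)
```

The resulting pixel can then go through the same click-to-move path as a manual click.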
The repository is a compact stack: a FastAPI backend and a single self-contained HTML control panel. Take it, fork it, rewire it to your own hardware.