ROS Package bob_sdlviz


A high-performance ROS 2 visualization node based on SDL2. Designed for flexible, headless streaming of markers, video frames, and dynamic text overlays to platforms like Twitch.

Overview

bob_sdlviz allows you to create complex visualization layouts combining:

  • Marker Layers: Render visualization_msgs/MarkerArray with custom scaling, offsets, and namespace filtering.

  • Terminal Layers: Dynamic text overlays with word-wrapping and auto-expiration.

  • Video Streams: Integration of raw BGRA frame buffers from external sources (e.g., FFmpeg/MPV).

It is specifically optimized for Docker and headless streaming using a dummy video driver.


Installation

Dependencies

Ensure you have the following system libraries installed:

sudo apt update
sudo apt install libsdl2-dev libsdl2-ttf-dev nlohmann-json3-dev

ROS 2 Dependencies

The package requires standard ROS 2 message libraries:

  • rclcpp

  • std_msgs

  • visualization_msgs


Building

Standard colcon build:

cd ~/ros2_ws
colcon build --packages-select bob_sdlviz
source install/setup.bash

Usage

Launching

Run the node directly:

ros2 run bob_sdlviz sdlviz

Configuration via Environment Variables

Many parameters can be mapped from environment variables for easier Docker integration:

| Env Variable | Used for Parameter | Default |
|---|---|---|
| SDLVIZ_WIDTH | screen_width | 854 |
| SDLVIZ_HEIGHT | screen_height | 480 |
| SDLVIZ_SHOW_WINDOW | show_window | true |
| SDLVIZ_STREAM_OUTPUT | stream_output | false |
| SDLVIZ_STREAM_PATH | stream_path | /tmp/video_pipe |
| SDLVIZ_CONFIG_PATH | config_file_path | "" |
| SDLVIZ_FONT_PATH | font_path | /usr/share/fonts/... |
| SDLVIZ_FPS | fps | 30.0 |
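
The mapping above can be sketched in Python (illustrative only — the node itself is C++ and resolves these via rclcpp; the truncated font path is left out here):

```python
import os

# Defaults mirror the table above; the cast turns the env string into the
# parameter's type.
DEFAULTS = {
    "screen_width": ("SDLVIZ_WIDTH", 854, int),
    "screen_height": ("SDLVIZ_HEIGHT", 480, int),
    "show_window": ("SDLVIZ_SHOW_WINDOW", True, lambda s: s.lower() == "true"),
    "stream_output": ("SDLVIZ_STREAM_OUTPUT", False, lambda s: s.lower() == "true"),
    "stream_path": ("SDLVIZ_STREAM_PATH", "/tmp/video_pipe", str),
    "config_file_path": ("SDLVIZ_CONFIG_PATH", "", str),
    "fps": ("SDLVIZ_FPS", 30.0, float),
}

def resolve_params(env=os.environ):
    """Resolve each parameter from its env variable, else the default."""
    params = {}
    for name, (var, default, cast) in DEFAULTS.items():
        raw = env.get(var)
        params[name] = cast(raw) if raw is not None else default
    return params
```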


ROS 2 API

Parameters

| Parameter | Type | Description |
|---|---|---|
| screen_width | int | Target screen/video width in pixels. (Env: SDLVIZ_WIDTH) |
| screen_height | int | Target screen/video height in pixels. (Env: SDLVIZ_HEIGHT) |
| show_window | bool | Whether to show the local SDL window. (Env: SDLVIZ_SHOW_WINDOW) |
| stream_output | bool | Enable writing frames to a FIFO pipe. (Env: SDLVIZ_STREAM_OUTPUT) |
| stream_path | string | Path to the output FIFO pipe for streaming. (Env: SDLVIZ_STREAM_PATH) |
| config_file_path | string | Path to a JSON file for initial layout. (Env: SDLVIZ_CONFIG_PATH) |
| font_path | string | Path to the TTF font file. (Env: SDLVIZ_FONT_PATH) |
| font_size | int | Base font size for terminals. (Env: SDLVIZ_FONT_SIZE) |
| fps | double | Target rendering and streaming FPS. (Env: SDLVIZ_FPS) |

Topics

Subscribed

  • events (std_msgs/msg/String): Primary control topic. Receives JSON arrays to define, update, or remove dynamic layers.

  • Dynamic Topics: Subscribes to topics defined in the JSON configuration (e.g., marker topics or text strings).

Published

  • events_changed (std_msgs/msg/String): Reports the current active configuration of all layers as a JSON array whenever a change occurs (add, remove, update, or automatic expiration). Useful for UI synchronization or external dashboards.
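
A dashboard consuming events_changed could, for instance, index the reported layers by their id (a sketch; it assumes every reported layer carries an id, which holds since omitted ids are auto-generated):

```python
import json

def layers_by_id(events_changed_json: str) -> dict:
    """Index the active-layer report from events_changed by layer id."""
    layers = json.loads(events_changed_json)
    return {layer["id"]: layer for layer in layers}

# Example payload as the node might report it (fields are illustrative):
report = '[{"id": "alert", "type": "String", "area": [20, 20, 400, 120]}]'
active = layers_by_id(report)
```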


Dynamic Configuration

The node is controlled by sending JSON arrays to the events topic. Each object in the array represents a layer operation.

Layer Identification (id)

Every layer can have an optional id field.

  • Explicit ID: Use this to uniquely identify a layer for later updates or removal.

  • Auto-ID: If id is omitted, the system automatically assigns a sequential ID (id0, id1, id2, …).

Actions

  • add (default): Creates a new layer or updates an existing one if the id already exists.

  • remove: Deletes the layer with the specified id.

Layer Types & Parameters

All layer types support the following common fields:

  • id (string, optional): Unique ID. Auto-generated if omitted.

  • action (string, optional): add (default) or remove.

  • title (string, optional): Display a title bar above the layer content.

  • expire (float, optional): Auto-remove the layer after N seconds. Set to 0 or omit for infinite lifetime.

1. String (Text Terminal)

Renders a rolling text terminal.

  • topic (string, optional): ROS topic for incoming strings.

  • text (string, optional): Static text to display immediately (useful for one-off alerts).

  • area (array): [x, y, width, height].

  • text_color: [R, G, B, A] (Default: [200, 200, 200, 255]).

  • bg_color: [R, G, B, A] (Default: [30, 30, 30, 180]).

  • align (string): left (default), center, or right.

  • line_limit (int): Max number of lines to keep in history.

  • wrap_width (int): Number of characters before wrapping.

  • clear_on_new (bool): Clear terminal when a new message arrives.

  • append_newline (bool): Automatically add \n to messages.
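
Putting these fields together, an /events payload for a terminal layer could be built like this (a sketch — values are illustrative and publishing via rclpy is omitted):

```python
import json

def make_terminal_layer(layer_id, area, text=None, topic=None, **extra):
    """Build one String-layer operation for the events topic."""
    layer = {"id": layer_id, "type": "String", "area": list(area)}
    if text is not None:
        layer["text"] = text      # static one-off text
    if topic is not None:
        layer["topic"] = topic    # or a ROS topic for incoming strings
    layer.update(extra)           # align, expire, colors, ...
    return layer

# The events topic expects a JSON *array* of layer operations.
payload = json.dumps([make_terminal_layer(
    "alert", (20, 20, 400, 120),
    text="Engine temperature critical!",
    align="center", expire=10.0)])
```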

2. Image

Renders a sensor_msgs/msg/Image.

  • topic (string): ROS topic for incoming images.

  • area (array, optional): [x, y, width, height].

    • If omitted: Pos [0, 0], size is original image dimensions.

    • If width > 0 and height == 0: Height is calculated proportionally.

    • If width == 0 and height > 0: Width is calculated proportionally.

    • If both specified: Image is stretched/shrunk to fit.
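
The sizing rules above can be written out as a small function (a sketch of the documented behavior, not the node's actual code):

```python
def resolve_image_size(src_w, src_h, area=None):
    """Apply the area rules: no area -> position (0, 0) at native size;
    one zero dimension -> scale proportionally; both set -> stretch."""
    if area is None:
        return 0, 0, src_w, src_h
    x, y, w, h = area
    if w > 0 and h == 0:
        h = round(src_h * w / src_w)   # height follows the aspect ratio
    elif w == 0 and h > 0:
        w = round(src_w * h / src_h)   # width follows the aspect ratio
    return x, y, w, h
```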

3. VideoStream (FIFO Input)

Displays raw video buffers from a pipe.

  • topic: Path to the FIFO pipe (e.g., /tmp/overlay_video).

  • area: [x, y, width, height] on screen.

  • source_width / source_height: Dimensions of the raw input frames (Default: 640x480).

4. MarkerLayer (2D Projection)

Projects 3D ROS markers onto a 2D plane.

  • topic: ROS topic for visualization_msgs/MarkerArray.

  • area: [x, y, width, height] (drawing bounds).

  • scale (float): Mapping of ROS meters to pixels (Default: 1000.0).

  • offset_x / offset_y: Fine-tuning of the projection center.

  • exclude_ns (string): Comma-separated list of marker namespaces to hide.
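
As a rough mental model of the projection (the exact formula is internal to the node — this is an assumption for illustration): marker coordinates in meters are multiplied by scale to get pixels, relative to the center of the drawing area, shifted by the offsets.

```python
def project_point(x_m, y_m, area, scale=1000.0, offset_x=0.0, offset_y=0.0):
    """Assumed 2D projection: area center is the origin, meters * scale
    gives pixels; screen y grows downward, so ROS y is negated."""
    ax, ay, aw, ah = area
    px = ax + aw / 2 + offset_x + x_m * scale
    py = ay + ah / 2 + offset_y - y_m * scale
    return px, py
```

With scale 1500.0, a marker 0.1 m to the right of the origin lands 150 px right of the area center.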

Video Stream Integration

To feed an external video source into the sdlviz node:

1. General FIFO streaming

  1. Create a FIFO pipe:

    mkfifo /tmp/overlay_video
    
  2. Feed the pipe with FFmpeg (BGRA format at real-time speed):

    ffmpeg -re -f lavfi -i testsrc=size=854x480:rate=30 -f rawvideo -pix_fmt bgra /tmp/overlay_video
    

2. Streaming a Terminal Window (Linux/X11)

To capture a specific terminal window and pipe it into the Docker container:

  1. Find window geometry: Run xwininfo and click on the target terminal. Note the -geometry line (e.g., 854x480+10+10).

  2. Stream to container:

    # Use the width, height, and offsets from xwininfo.
    # The write_fifo.sh script inside the container creates the FIFO and handles the input.
    ffmpeg -re -f x11grab -video_size 854x480 -i :0.0+10,10 -f rawvideo -pix_fmt bgra - | \
      docker exec -i nexus_streamer /root/ros2_ws/install/bob_sdlviz/lib/bob_sdlviz/write_fifo.sh --path /tmp/overlay_video
    

Premium UI Overlay (Glassmorphism & Markdown)

For a modern “Browser Source” look with real-time Markdown rendering (ideal for LLM streams), use the webvideo node from the bob_av_tools package:

  1. Install dependencies:

    pip install PySide6 numpy
    # In Docker, you might also need: apt-get install -y libxcb-cursor0 libgbm1 libnss3 libasound2
    
  2. Launch the Renderer Node: This node renders an offscreen Chromium instance and pipes frames to /tmp/overlay_video. It can also publish to a ROS topic.

    ros2 run bob_av_tools webvideo --ros-args -p fifo_path:=/tmp/overlay_video -p sub_topic:=/bob/llm_stream
    
  3. Spawn the layer in sdlviz via /events:

    {
      "type": "VideoStream",
      "topic": "/tmp/overlay_video",
      "area": [50, 50, 400, 600],
      "source_width": 854,
      "source_height": 480
    }
    
  4. Feed the stream: Send tokens to the configured topic.

    ros2 topic pub /bob/llm_stream std_msgs/msg/String "{data: '## Hello World\nThis is a **Premium Overlay**!'}" --once
    

> [!NOTE]
> sdlviz expects exactly 4 bytes per pixel (BGRA). Using 3-byte formats (like RGB or BGR) will result in distorted images.
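
A frame must therefore be exactly width × height × 4 bytes (e.g. 854 × 480 × 4 = 1,639,680 bytes). A minimal sketch that builds one solid-color BGRA frame, which could then be written to the FIFO:

```python
import struct

def bgra_frame(width, height, b=0, g=0, r=255, a=255):
    """Build one raw BGRA frame as bytes (4 bytes per pixel)."""
    pixel = struct.pack("4B", b, g, r, a)   # byte order: B, G, R, A
    return pixel * (width * height)

frame = bgra_frame(854, 480)                # one red 854x480 frame
assert len(frame) == 854 * 480 * 4          # 1,639,680 bytes
```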

Example JSON

Add an Image Layer with Proportional Scaling:

[
  {
    "id": "cam_view",
    "type": "Image",
    "topic": "/camera/image_raw",
    "area": [20, 20, 320, 0],
    "title": "USB Cam",
    "expire": 0.0
  }
]

Add a Terminal with Title and Static Text:

[
  {
    "id": "alert",
    "type": "String",
    "title": "⚠ CRITICAL ALERT",
    "text": "Engine temperature critical!",
    "area": [227, 20, 400, 120],
    "text_color": [255, 50, 50, 255],
    "bg_color": [0, 0, 0, 200],
    "align": "center",
    "expire": 10.0
  }
]

Add/Update a persistent Scene:

[
  {
    "id": "status_log",
    "type": "String",
    "topic": "/bob/log",
    "area": [10, 10, 400, 200],
    "text_color": [255, 255, 255, 255],
    "bg_color": [0, 0, 0, 150],
    "line_limit": 20,
    "wrap_width": 60
  },
  {
    "id": "scene_markers",
    "type": "MarkerLayer",
    "topic": "/bob/markers",
    "area": [420, 10, 400, 400],
    "scale": 1500.0,
    "exclude_ns": "env,background"
  }
]

Remove a Layer:

[
  {
    "id": "status_log",
    "action": "remove"
  }
]

Audio System

bob_sdlviz implements a robust, non-blocking audio architecture to ensure streams never stall and audio remains synchronized.

Non-Blocking Audio Bridge

The audio_bridge.py script acts as a persistent master source for FFmpeg. It:

  1. Silence Injection: Continuously feeds silent audio frames into a master pipe (/tmp/audio_master_pipe).

  2. Dynamic Mixing: Listens to the user-facing audio pipe (/tmp/audio_pipe) and mixes any incoming audio with the baseline silence.

This ensures that FFmpeg always has an active audio stream, even if you stop or start your external audio source.
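
The silence baseline amounts to a fixed byte rate: at 44100 Hz, 16-bit stereo (s16le, 2 channels), one second of silence is 44100 × 2 × 2 = 176,400 zero bytes. A sketch of that part (illustrative only; audio_bridge.py's actual implementation may differ):

```python
def silence_chunk(duration_s=0.1, rate=44100, channels=2, sample_bytes=2):
    """Generate a chunk of s16le silence matching the stream format."""
    n_bytes = int(duration_s * rate) * channels * sample_bytes
    return b"\x00" * n_bytes

chunk = silence_chunk(0.1)   # 17,640 bytes of silence per 100 ms
```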

Feeding Audio

To inject audio into the running stream from your host machine, use the feed_audio.sh script:

# Feed a music file
./scripts/feed_audio.sh my_music.mp3

# Feed local system audio (PulseAudio)
./scripts/feed_audio.sh default

Configuration

| Env Variable | Description |
|---|---|
| ENABLE_AUDIO | fifo (use the bridge), pulse (direct host audio), or false. |
| SDLVIZ_AUDIO_PATH | The user-facing pipe to write audio into (/tmp/audio_pipe). |
| SDLVIZ_AUDIO_MASTER_PATH | The internal pipe used by the bridge (/tmp/audio_master_pipe). |


Docker Deployment

bob_sdlviz is designed to be fully containerized, allowing for consistent streaming environments without host-side dependencies (like X-Server).

.env Configuration

Copy the provided template to create your own .env file:

cp .env.template .env

Ensure you set your TWITCH_STREAM_KEY in this file. This key is used by start_stream.sh to authenticate with Twitch.

Dockerfile

The included Dockerfile builds a complete ROS 2 Humble environment including:

  • SDL2 & TTF: For high-quality text and marker rendering.

  • FFmpeg: For RTMP streaming.

  • Python Bridge: For silence injection and audio mixing.

Docker Compose

Use Docker Compose for the easiest deployment. It handles volume mounting for the Unix pipes and sets the necessary environment variables.

To start the streamer:

docker compose up --build

Note: The container runs with network_mode: host to allow seamless discovery of ROS topics on your local network.

Pre-built Image (GHCR)

A pre-built image is automatically published to the GitHub Container Registry on every push to main:

docker pull ghcr.io/bob-ros2/bob-sdlviz:latest

The image is also tagged with the short Git commit SHA (sha-xxxxxxx) for reproducible deployments.


Streaming & Headless Operation

To run in a container or on a remote server without a GPU or X-Server, follow these critical steps:

1. Enable Dummy Driver

Set the environment variable SDL_VIDEODRIVER=dummy. This tells SDL to perform all rendering in software memory rather than attempting to open a graphical window.

2. Configure for Pipe Output

Ensure the following parameters are set (either in .env or as ROS parameters):

  • show_window: false

  • stream_output: true

  • stream_path: /tmp/video_pipe (Matches the volume mount in Docker Compose)

3. FFMPEG Orchestration

The primary entrypoint in the container (start_stream.sh) automatically coordinates:

  • Starting the sdlviz node. All arguments passed to start_stream.sh are forwarded directly to the node.

    • Example (Namespace): ./scripts/start_stream.sh --ros-args -r __ns:=/my_ns

    • Example (Params): ./scripts/start_stream.sh --ros-args --params-file my_config.yaml

  • Starting the audio_bridge.py to provide a baseline silent audio stream.

  • Launching ffmpeg to combine the video pipe and audio pipe into a single FLV stream sent to Twitch.

  • INGEST_SERVER Configuration: You can change the Twitch ingest server (e.g., for different regions) by setting the INGEST_SERVER environment variable (Default: Frankfurt, DE).

  • Headless Mode Enforcement: The container defaults to SDL_VIDEODRIVER=dummy, ensuring that no physical or virtual display is required.

Manual FFMPEG Command Example: If you wish to run FFMPEG manually (e.g., outside of Docker), use this optimized configuration for low-latency streaming:

ffmpeg -f rawvideo -pixel_format bgra -video_size 854x480 -framerate 30 -i /tmp/video_pipe \
       -f s16le -ar 44100 -ac 2 -i /tmp/audio_pipe \
       -vcodec libx264 -preset ultrafast -tune zerolatency -pix_fmt yuv420p \
       -g 60 -b:v 3000k -maxrate 3000k -bufsize 6000k \
       -acodec aac -ab 128k -f flv "${INGEST_SERVER}${TWITCH_STREAM_KEY}"