ROS Package bob_llm
The bob_llm package provides a ROS 2 node (llm node) that acts as a powerful interface to an external Large Language Model (LLM). It operates as a stateful service that maintains a conversation, connects to any OpenAI-compatible API, and features a robust tool execution system.
Features
OpenAI-Compatible: Connects to any LLM backend that exposes an OpenAI-compatible API endpoint (e.g., Ollama, vLLM, llama-cpp-python, commercial APIs).
Stateful Conversation: Maintains chat history to provide conversational context to the LLM.
Dynamic Tool System: Dynamically loads Python functions from user-provided files and makes them available to the LLM. The LLM can request to call these functions to perform actions or gather information.
Streaming Support: Can stream the LLM’s final response token-by-token for real-time feedback.
Fully Parameterized: All configuration, from API endpoints to LLM generation parameters, is handled through a single ROS parameters file.
Multi-modality: Supports multimodal input (e.g., images) via JSON prompts.
Lightweight: The node is simple and has minimal dependencies, requiring only a few standard Python libraries (requests, PyYAML) on top of ROS 2.
Installation
Clone the Repository
Navigate to your ROS 2 workspace's src directory and clone the repository:
cd ~/ros2_ws/src
git clone https://github.com/bob-ros2/bob_llm.git
Install Dependencies The node requires a few Python packages. It is recommended to install these within a virtual environment.
pip install requests PyYAML
The required ROS 2 dependencies (rclpy, std_msgs) will be resolved by the build system.
Build the Workspace
Navigate to the root of your workspace and build the package:
cd ~/ros2_ws
colcon build --packages-select bob_llm
Source the Workspace Before running the node, remember to source your workspace’s setup file:
source install/setup.bash
Usage
1. Run the Node
Before running, ensure your LLM server is active and the api_url in your params file is correct.
# Make sure your workspace is sourced
# source install/setup.bash
# Run the node with your parameters file
ros2 run bob_llm llm --ros-args --params-file /path/to/your/ros2_ws/src/bob_llm/config/node_params.yaml
2. Interact with the Node
The package includes a convenient helper script, scripts/query.sh, for interacting with the node directly from the command line.
Once the llm node is running, open a new terminal (with the workspace sourced) and run the script:
$ ros2 run bob_llm query.sh
--- Listening for results on llm_response ---
--- Enter your prompt below (Press Ctrl+C to exit) ---
> What is the status of the robot?
Robot status: Battery is at 85%. All systems are nominal. Currently idle.
>
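If you prefer the raw ROS 2 CLI over the helper script, the same interaction works with two terminals (topic names as listed in the ROS 2 API section below):

# Terminal 1: watch for responses
ros2 topic echo /llm_response

# Terminal 2: send a prompt
ros2 topic pub --once /llm_prompt std_msgs/msg/String "data: 'What is the status of the robot?'"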
3. Advanced Input & Multi-modality
The node supports advanced input formats beyond simple text strings. If the input message on /llm_prompt is a valid JSON string, it is parsed and treated as a message object.
Generic JSON Input:
You can pass any valid JSON dictionary. If it contains a role field (e.g., user), it is treated as a standard message object and appended to the history. This allows you to send custom content structures supported by your specific LLM backend (e.g., complex multimodal inputs, custom fields).
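For example, a plain user turn can be published as a JSON message object in the same way as a raw string prompt:

ros2 topic pub /llm_prompt std_msgs/msg/String "data: '{\"role\": \"user\", \"content\": \"Summarize the robot status in one sentence\"}'" -1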
Image Handling Helper:
For convenience, the node includes a helper for handling images. If process_image_urls is set to true, the node looks for an image_url field in your JSON input. It will automatically fetch the image (from file:// or http:// URLs), base64 encode it, and format the message according to the OpenAI Vision API specification.
Example (Image Helper):
ros2 topic pub /llm_prompt std_msgs/msg/String "data: '{\"role\": \"user\", \"content\": \"Describe this image\", \"image_url\": \"file:///path/to/image.jpg\"}'" -1
Conversation Flow
1. A user publishes a prompt to the /llm_prompt topic.
2. The llm node adds the prompt to its internal chat history.
3. The node sends the history and a list of available tools to the LLM backend.
4. The LLM decides whether to respond directly or use a tool.
If Tool: The LLM returns a request to call a specific function. The llm node executes the function, appends the result to the history, and sends the updated history back to the LLM. This loop can repeat multiple times (see the example exchange below).
If Text: The LLM generates a final, natural language response.
5. The llm node publishes the final response. If streaming is enabled, it is sent token-by-token to /llm_stream and the full message is sent to /llm_response upon completion. Otherwise, the full response is sent directly to /llm_response.
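To make the tool loop concrete, a single round trip expressed as OpenAI-style message objects looks roughly like this (illustrative; the exact fields depend on your backend):

[
  {"role": "user", "content": "What is the status of the robot?"},
  {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function",
    "function": {"name": "get_robot_status", "arguments": "{}"}}]},
  {"role": "tool", "tool_call_id": "call_1",
    "content": "Robot status: Battery is at 85%. All systems are nominal. Currently idle."},
  {"role": "assistant", "content": "The battery is at 85%, all systems are nominal, and the robot is currently idle."}
]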
ROS 2 API
The node uses the following topics for communication:
| Topic | Type | Description |
|---|---|---|
| /llm_prompt | std_msgs/msg/String | (Subscribed) Receives user prompts to be processed by the LLM. |
| /llm_response | std_msgs/msg/String | (Published) Publishes the final, complete response from the LLM. |
| /llm_stream | std_msgs/msg/String | (Published) Publishes token-by-token chunks of the LLM's response if streaming is enabled. |
| | std_msgs/msg/String | (Published) Publishes the latest user/assistant turn as a JSON string. |
Configuration
The node is configured entirely through a ROS parameters YAML file (e.g., config/node_params.yaml).
ROS Parameters
All parameters can be set via a YAML file, command-line arguments, or environment variables. The order of precedence is: command-line arguments > parameters file > environment variables > coded defaults. For array-type parameters, environment variables should be comma-separated strings (e.g., LLM_STOP="stop1,stop2").
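For example, the following sets the stop sequences through the environment and overrides the temperature parameter on the command line; if temperature also appears in the params file, the command-line value takes precedence:

export LLM_STOP="User:,Assistant:"
ros2 run bob_llm llm --ros-args --params-file /path/to/node_params.yaml -p temperature:=0.2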
| Parameter | Type | Description |
|---|---|---|
| | string | The type of the LLM backend API. Currently only the OpenAI-compatible API type is supported. |
| api_url | string | The base URL of the LLM backend; the node appends the API endpoint path to it. |
| | string | The API key (Bearer token) for authentication with the LLM backend, if required. |
| | string | The specific model name to use (e.g., "gpt-4", "llama3"). |
| | double | Timeout in seconds for API requests to the LLM backend. |
| system_prompt | string | The system prompt to set the LLM's context, personality, and instructions. |
| | string | A JSON string of initial messages for few-shot prompting to guide the LLM. |
| | integer | Maximum number of user/assistant conversational turns to keep in history. |
| message_log | string | If set to a file path, appends each conversational turn to a persistent JSON log file. |
| | bool | Enable or disable streaming for the final LLM response. |
| process_image_urls | bool | If true, processes image_url fields in JSON prompts and formats the image for the OpenAI Vision API (see Advanced Input & Multi-modality). |
| | integer | Maximum number of consecutive tool calls before aborting to prevent loops. |
| temperature | double | Controls the randomness of the output. Lower is more deterministic. |
| top_p | double | Nucleus sampling. Controls output diversity. Alter this or temperature, not both. |
| max_tokens | integer | Maximum number of tokens to generate. |
| stop | string array | A list of sequences where the API will stop generating further tokens. |
| presence_penalty | double | Penalizes new tokens based on whether they appear in the text so far. |
| frequency_penalty | double | Penalizes new tokens based on their existing frequency in the text so far. |
| tool_interfaces | string array | A list of absolute paths to Python files containing tool functions. The node will attempt to load functions from each file path provided. |
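As a starting point, a params file combining several of the options above might look like the sketch below (illustrative values; adjust the URL, prompt, and paths for your setup):

llm:
  ros__parameters:
    api_url: "http://localhost:11434/v1"   # e.g. a local Ollama server's OpenAI-compatible endpoint
    system_prompt: "You are a helpful robot assistant."
    temperature: 0.7
    max_tokens: 1024
    process_image_urls: true
    message_log: "/home/user/conversation.json"
    tool_interfaces:
      - "/home/user/ros2_ws/src/bob_llm/config/example_interface.py"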
Conversation Logging
The node can optionally save the entire conversation to a JSON file, which is useful for debugging, analysis, or creating datasets for fine-tuning models.
To enable logging, set the message_log parameter to an absolute file path (e.g., /home/user/conversation.json). The node will append each user prompt and the corresponding assistant response to this file. If the file does not exist, it will be created. On the first write to a new file, the system_prompt (if configured) will be automatically added as the first entry.
The resulting file will be a flat JSON array of message objects, like this:
[
{
"role": "system",
"content": "You are a helpful robot assistant."
},
{
"role": "user",
"content": "What is the status of the robot?"
},
{
"role": "assistant",
"content": "Robot status: Battery is at 85%. All systems are nominal. Currently idle."
}
]
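Because the log is a flat JSON array, post-processing it is straightforward. For instance, a small script (hypothetical, assuming the format shown above) can convert the log into a JSONL file, a common layout for fine-tuning datasets:

import json

# Load the conversation log written by the llm node (path set via the message_log parameter).
with open("/home/user/conversation.json") as f:
    messages = json.load(f)

# Write one message object per line (JSONL).
with open("/home/user/conversation.jsonl", "w") as f:
    for message in messages:
        f.write(json.dumps(message) + "\n")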
Tool System
The standout feature of this node is its ability to use dynamically loaded Python functions as tools. The LLM can request to call these functions to perform actions or gather information.
Creating a Tool File
A tool file is a standard Python script containing one or more functions. The system automatically generates the necessary API schema for the LLM from your function’s signature (including type hints) and its docstring. The first line of the docstring is used as the function’s description for the LLM.
Example: config/example_interface.py
def get_weather(location: str, unit: str = "celsius") -> str:
    """
    Get the current weather in a given location.
    This is an example function and will return a fixed string.
    """
    if "tokyo" in location.lower():
        return f"The weather in Tokyo is 10 degrees {unit} and sunny."
    elif "san francisco" in location.lower():
        return f"The weather in San Francisco is 15 degrees {unit} and foggy."
    else:
        return f"Sorry, I don't have the weather for {location}."

def get_robot_status() -> str:
    """
    Retrieves the current status of the robot.
    This function checks the robot's battery level, joint states, and current task.
    """
    # In a real scenario, this would query robot topics or services
    return "Robot status: Battery is at 85%. All systems are nominal. Currently idle."
Configuring Tools
To make your tools available to the LLM, you must provide a list of absolute paths to your Python tool files in the tool_interfaces parameter.
Open config/node_params.yaml and edit the tool_interfaces list. You must replace any placeholder path with the full, absolute path to the tool file on your system.
For example:
# In config/node_params.yaml
llm:
  ros__parameters:
    # ... other parameters

    # A list of Python modules to load as tool interfaces.
    # Replace the path below with the absolute path on your machine.
    tool_interfaces:
      - "/home/user/ros2_ws/src/bob_llm/config/example_interface.py"
      # You can add more tool files here
      # - "/home/user/ros2_ws/src/my_robot_tools/my_robot_tools/tools.py"

    # ... other parameters
Inbuilt Tools
The package comes with several ready-to-use tool modules in the config/ directory.
1. ROS CLI Tools (config/ros_cli_tools.py)
This module provides a comprehensive set of tools that wrap standard ROS 2 command-line interface (CLI) functionalities. It allows the LLM to inspect the system (list nodes, topics, services) and interact with it (publish messages, call services, get/set parameters).
Dependencies:
None (uses standard ROS 2 libraries and CLI tools).
Usage:
Add the absolute path to config/ros_cli_tools.py to your tool_interfaces parameter.
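With these tools loaded, an interaction via query.sh could look like this (illustrative output; the actual lists depend on your running system):

> Which topics are currently active?
The following topics are active: /llm_prompt, /llm_response, /llm_stream, /parameter_events, /rosout.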
2. Qdrant Memory Tools (config/qdrant_tools.py)
This module enables long-term memory for the LLM using the Qdrant vector database. It uses the Model Context Protocol (MCP) to communicate with a Qdrant MCP server.
Features:
save_memory: Stores information with optional metadata.
search_memory: Semantically searches for relevant information in the database.
Prerequisites:
Qdrant Server: You must have a running Qdrant server instance. See Qdrant Quickstart for installation instructions.
MCP Python Package: Install the mcp library:
pip install mcp
Qdrant MCP Server: Install the Qdrant MCP server (see mcp-server-qdrant). Ensure mcp-server-qdrant is in your PATH.
Environment Variables:
The qdrant_tools.py module requires the following environment variables to connect to your Qdrant instance:
export QDRANT_URL="http://localhost:6333"
export QDRANT_API_KEY="your_key"
export COLLECTION_NAME="my_knowledge_base"
Usage:
Add the absolute path to config/qdrant_tools.py to your tool_interfaces parameter.
The node will load all specified tool files at startup.
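Once the Qdrant server and MCP server are running and the tools are loaded, a memory interaction could look like this (illustrative):

> Remember that the charging dock is in room B12.
Noted. I have saved to memory that the charging dock is located in room B12.
> Where is the charging dock?
According to my saved memory, the charging dock is in room B12.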