Run RL Swarm Node

RL Swarm is a fully open-source framework developed by GensynAI for building reinforcement learning (RL) training swarms over the internet. This guide walks you through setting up an RL Swarm node and a web UI dashboard to monitor swarm activity.

Hardware Requirements

  • RAM: Minimum 16 GB (more recommended for larger models or datasets).
  • GPU (Optional): Supported CUDA devices for enhanced performance:
    • RTX 3090
    • RTX 4090
    • A100
    • H100
  • Note: You can run the node without a GPU using CPU-only mode (details in the docker-compose.yaml section).
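
  • Quick check: you can confirm available RAM and whether a CUDA-capable GPU is visible before installing anything (nvidia-smi is only present once the NVIDIA driver is installed):
free -h        # total and available memory
nvidia-smi     # lists NVIDIA GPUs, if any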

Install Dependencies

1. Update System Packages

sudo apt-get update && sudo apt-get upgrade -y

2. Install General Utilities and Tools

sudo apt install curl iptables build-essential git wget lz4 jq make gcc nano automake autoconf tmux htop nvme-cli libgbm1 pkg-config libssl-dev libleveldb-dev tar clang bsdmainutils ncdu unzip -y

3. Install Docker

# Remove old Docker installations
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done

# Add Docker repository
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Test Docker
sudo docker run hello-world
  • Tip: To run Docker without sudo, add your user to the Docker group:
sudo usermod -aG docker $USER
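
  • Group membership only takes effect in new sessions, so log out and back in (or start a shell with the docker group active) and confirm Docker runs without sudo:
newgrp docker
docker run hello-world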

4. Install Python

sudo apt-get install python3 python3-pip
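
  • Optional: confirm the interpreter and pip are available:
python3 --version
pip3 --version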

Clone the Repository

git clone https://github.com/gensyn-ai/rl-swarm/
cd rl-swarm

Create docker-compose.yaml

This file defines the services: the RL Swarm node, telemetry collector, and web UI.

  1. Rename the old file:
mv docker-compose.yaml docker-compose.yaml.old
  2. Create the new file:
nano docker-compose.yaml
  3. Paste the following configuration:
version: '3'

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.120.0
    ports:
      - "4317:4317"  # OTLP gRPC
      - "4318:4318"  # OTLP HTTP
      - "55679:55679"  # Prometheus metrics (optional)
    environment:
      - OTEL_LOG_LEVEL=DEBUG

  swarm_node:
    image: europe-docker.pkg.dev/gensyn-public-b7d9/public/rl-swarm:v0.0.1
    command: ./run_hivemind_docker.sh
    runtime: nvidia  # Enables GPU support; remove if no GPU is available
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
      - PEER_MULTI_ADDRS=/ip4/38.101.215.13/tcp/30002/p2p/QmQ2gEXoPJg6iMBSUFWGzAabS2VhnzuS782Y637hGjfsRJ
      - HOST_MULTI_ADDRS=/ip4/0.0.0.0/tcp/38331
    ports:
      - "38331:38331"  # Exposes the swarm node's P2P port
    depends_on:
      - otel-collector

  fastapi:
    build:
      context: .
      dockerfile: Dockerfile.webserver
    environment:
      - OTEL_SERVICE_NAME=rlswarm-fastapi
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
      - INITIAL_PEERS=/ip4/38.101.215.13/tcp/30002/p2p/QmQ2gEXoPJg6iMBSUFWGzAabS2VhnzuS782Y637hGjfsRJ
    ports:
      - "8080:8000"  # Maps port 8080 on the host to 8000 in the container
    depends_on:
      - otel-collector
      - swarm_node
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/healthz"]
      interval: 30s
      retries: 3
  • GPU/CPU Note: If you don't have an NVIDIA GPU or the NVIDIA Container Runtime, remove the runtime: nvidia line under swarm_node to run on CPU.
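
  • Runtime check: if you are unsure whether the NVIDIA runtime is registered with Docker, list the runtimes it knows about (the exact output varies by Docker version); if nvidia is not listed, install the NVIDIA Container Toolkit or remove the runtime: nvidia line:
docker info | grep -i runtimes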

What Each Service Does:

  • otel-collector: Gathers telemetry data (metrics, traces).
  • swarm_node: The core RL Swarm node connecting to the network.
  • fastapi: The web UI dashboard for monitoring.

Run RL Swarm Node + Web UI Dashboard

Start the services with:

docker compose up --build -d && docker compose logs -f
  • Note: The first run may take a while because the images have to be pulled and built. Look for a log line confirming that your node has joined the swarm:

(Screenshot: log output showing the node joining the swarm.)

  • Exit Logs: Press Ctrl+C
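
  • Ctrl+C only stops following the logs; because the stack was started with -d, the containers keep running in the background. To check their status or stop everything:
docker compose ps      # status of all services
docker compose down    # stop and remove the containers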

Check logs

  • RL Swarm node:
docker compose logs -f swarm_node
  • Web UI:
docker compose logs -f fastapi
  • Telemetry Collector:
docker compose logs -f otel-collector
  • All Logs: Use docker compose logs -f without a service name.
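
  • Tip: limit how much history is printed before following a single service:
docker compose logs --tail 100 -f swarm_node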

Access the Web UI Dashboard

  • VPS: http://<your-vps-ip>:8080/
  • Local PC: http://localhost:8080 or http://0.0.0.0:8080
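
  • You can also probe the dashboard from the command line; this hits the same /api/healthz path that the compose healthcheck uses, through the mapped host port (swap in your VPS IP if you are not on the machine itself):
curl -f http://localhost:8080/api/healthz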


Monitoring Your Node

  • The dashboard displays collective swarm data, not individual node stats. To track your node:

    1. Check the swarm_node logs for your node's unique ID (e.g., [F-d2042cff-01c9-4801-8ea7-1c1afc29c9b6]); see the grep sketch below.

    2. Search for this ID in the dashboard data to see your node's contributions.

  • Note: The dashboard monitors all peers together. The node ID in the logs appears to be your identifier, but I am still experimenting to learn more about per-node metrics.
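
  • A rough sketch for pulling your ID out of the logs with grep, assuming IDs follow the bracketed F-<uuid> format shown above (the pattern may need adjusting):
docker compose logs swarm_node | grep -oE '\[F-[0-9a-f-]+\]' | sort -u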
