05 -- Load Balancing

Prerequisite: 04 -- Failover
You will need: Running Hangar with an MCP server group from recipe 04
Time: 5 minutes
Adds: Distribute requests evenly across multiple MCP server instances

The Problem

You have two MCP servers in a failover group. All traffic goes to the primary -- the backup sits idle. You want to use both MCP servers and spread the load.

The Config

# config.yaml -- Recipe 05: Load Balancing
mcp_servers:
  my-mcp:
    mode: remote
    endpoint: "http://localhost:8080"
    health_check_interval_s: 10          # from recipe 02
    max_consecutive_failures: 3          # from recipe 03

  my-mcp-backup:
    mode: remote
    endpoint: "http://localhost:8081"
    health_check_interval_s: 10
    max_consecutive_failures: 3

  my-mcp-3:                              # NEW: third instance
    mode: remote
    endpoint: "http://localhost:8082"
    health_check_interval_s: 10
    max_consecutive_failures: 3

  my-mcp-group:
    mode: group
    strategy: round_robin                # NEW: changed from priority to round_robin
    min_healthy: 1
    members:
      - id: my-mcp
        weight: 1                        # NEW: equal weight
      - id: my-mcp-backup
        weight: 1                        # NEW: equal weight
      - id: my-mcp-3                     # NEW: third member
        weight: 1

Try It

  1. Start all three MCP server instances on ports 8080, 8081, and 8082.

  2. Start Hangar and verify the group:

    mcp-hangar status
    my-mcp-group    group     ready    strategy=round_robin  members=3/3 healthy
    
  3. Make several tool calls through the group and observe distribution in the logs. Use the JSON-RPC approach from recipe 03:

    (
      echo '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}},"id":1}'
      sleep 0.5
      echo '{"jsonrpc":"2.0","method":"notifications/initialized","params":{}}'
      sleep 0.5
      echo '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"hangar_call","arguments":{"calls":[{"mcp_server":"my-mcp-group","tool":"add","arguments":{"a":1,"b":2}}]}},"id":2}'
      sleep 2
    ) | mcp-hangar serve 2>/dev/null | grep '"id":2'

    Each call routes to a different member in round-robin order.

  4. Stop one instance and observe redistribution:

    # Kill the process on port 8082
    mcp-hangar status
    my-mcp-group    group     partial  strategy=round_robin  members=2/3 healthy
    

    Traffic automatically redistributes to the remaining two healthy members.

What Just Happened

The round_robin strategy cycles through healthy members sequentially. Each request goes to the next member in the rotation. When a member fails health checks, it is removed from the rotation until it recovers.
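The rotation described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not Hangar's actual implementation: the class `RoundRobinGroup` and its methods are hypothetical names introduced for illustration, and the real health tracking is driven by the health check settings from recipes 02 and 03.

```python
class RoundRobinGroup:
    """Hypothetical model of a round_robin group (not Hangar's real code)."""

    def __init__(self, member_ids):
        self.member_ids = list(member_ids)   # fixed rotation order
        self.healthy = set(self.member_ids)  # members currently in rotation
        self._cursor = 0                     # index of the next candidate

    def mark_unhealthy(self, member_id):
        # Assumed to be called when a member fails its health checks.
        self.healthy.discard(member_id)

    def mark_healthy(self, member_id):
        # Assumed to be called when a member recovers.
        if member_id in self.member_ids:
            self.healthy.add(member_id)

    def next_member(self):
        # Advance the cursor, skipping unhealthy members; try each member once.
        for _ in range(len(self.member_ids)):
            candidate = self.member_ids[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.member_ids)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy members in group")


group = RoundRobinGroup(["my-mcp", "my-mcp-backup", "my-mcp-3"])
all_healthy = [group.next_member() for _ in range(3)]

group.mark_unhealthy("my-mcp-3")  # simulate stopping the instance on port 8082
after_failure = [group.next_member() for _ in range(4)]
```

With all three members healthy, `all_healthy` visits each member once in order. After `my-mcp-3` is marked unhealthy, `after_failure` alternates between the two remaining members, which mirrors the redistribution seen in step 4.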

Other available strategies:

Strategy              Behavior
round_robin           Cycle through members sequentially
random                Random member selection
least_connections     Route to member with fewest active calls
weighted_round_robin  Respect weight field -- higher weight gets more traffic
priority              Route to lowest priority number (primary/backup pattern)
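To make the effect of the weight field concrete, here is a rough Python sketch of one simple way a weighted rotation can be built: each member appears in the rotation as many times as its weight. The function `weighted_rotation` and the weights 2/1/1 are assumptions chosen for illustration; Hangar's weighted_round_robin may order or interleave members differently.

```python
from collections import Counter
from itertools import cycle, islice


def weighted_rotation(weights):
    """Yield member ids forever, each repeated `weight` times per cycle.

    Illustrative only: a naive expansion, not Hangar's actual algorithm.
    """
    slots = [member for member, weight in weights.items() for _ in range(weight)]
    return cycle(slots)


# Hypothetical weights: my-mcp should receive twice the traffic of the others.
weights = {"my-mcp": 2, "my-mcp-backup": 1, "my-mcp-3": 1}

# Take 8 picks (two full cycles of 4 slots) and count them per member.
picks = Counter(islice(weighted_rotation(weights), 8))
```

Over two full cycles, `my-mcp` is picked 4 times and each of the other two members 2 times, so traffic share follows the ratio of the weights. With every weight set to 1, as in this recipe's config, the result is the same as plain round_robin.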

Key Config Reference

Key               Type    Default      Description
strategy          string  round_robin  Load balancing strategy
members[].weight  int     1            Relative routing weight (used by the weighted_round_robin strategy)

What's Next

Your MCP servers are balanced -- but what happens when one client sends 1000 requests per second? You need to protect your MCP servers from overload.

--> 06 -- Rate Limiting