Redundant Control Plane Design Guide for Show Networks

How to design and validate redundant control paths for consoles, nodes, and gateways so failover behavior stays deterministic during live operation.

Y-Link, February 14, 2026

Design goals and common failure modes

Designing a redundant control plane for a show network requires clear operational goals. The primary objective is deterministic failover: when a control path fails during a show, consoles, nodes, and gateways must move to a known state within defined timing boundaries so FOH operators and stage crew can predict behavior.

Common failure modes in live production include single-link failure, switch or port fault, console crash, power loss at a node rack, gateway software hang, and human error during doors or load-in. Each failure mode should have an expected, documented reaction: for example, a node should hold its last known DMX output for up to a configured timeout rather than immediately blanking the rig when a multicast stream is interrupted.

Topology patterns that support deterministic failover

Preferred topologies for show networks balance redundancy with simplicity. Typical patterns used in production are:

  • Dual-homed core with ring-connected distribution switches for physical redundancy.
  • Star to dual-star hybrid where each node or gateway has two uplinks, one to the FOH/core and one to a stage-side distribution.
  • Independent control VLANs for primary and backup console traffic to avoid contention with media or general-purpose VLANs.

Keep control traffic (sACN multicast, and Art-Net, which is sent as unicast or broadcast) contained to the control VLAN, and use IGMP snooping on distribution switches to limit sACN multicast floods. Use simple, static or small routing domains for control traffic to reduce unpredictable convergence from dynamic routing protocols during a show.

Console and backup console strategy

Define exactly how a backup console will take over. Options include hot-takeover (immediate pass-through of control), preemptive takeover with operator confirmation, and passive shadowing. For live FOH, hot-takeover with a preconfigured priority table is common: the backup console is always ready and assigned a lower priority, so it takes effect only when the primary source is lost.

Practical settings to achieve determinism:

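  • Assign fixed priorities to sources where the protocol permits. sACN carries a per-source priority from 0 to 200 (default 100); Art-Net has no equivalent field, so determinism there depends on node merge settings. Ensure the backup console's priority is lower than the primary's but non-zero.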
  • Disable auto-merge or set known merge modes and default values on nodes so merged control produces predictable output.
  • Document and test the exact operator sequence to switch consoles at doors or during FOH changes; rehearsals should include console swaps under show timing constraints.
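
The priority rule above can be sketched as a small selection function. This is a minimal illustration, not any console's or node's actual logic: the `Source` class, the source names, and the priority values 110 and 100 are assumptions made for the example. The 2.5 second loss timeout is the network data loss timeout defined by E1.31; your nodes may expose a different, configurable value.

```python
from dataclasses import dataclass

SOURCE_LOSS_TIMEOUT_S = 2.5  # E1.31 network data loss timeout


@dataclass
class Source:
    name: str
    priority: int      # sACN priority, 0-200 (100 is the protocol default)
    last_seen: float   # timestamp of the last packet, in seconds


def active_source(sources, now, timeout=SOURCE_LOSS_TIMEOUT_S):
    """Return the live source with the highest priority, or None if all are lost."""
    live = [s for s in sources if now - s.last_seen <= timeout]
    if not live:
        return None
    # Equal priorities are broken by name here so the result is repeatable;
    # real receivers may merge equal-priority sources instead.
    return max(live, key=lambda s: (s.priority, s.name))


# Hypothetical consoles: the backup transmits continuously at a lower priority.
primary = Source("primary", priority=110, last_seen=100.0)
backup = Source("backup", priority=100, last_seen=100.0)

print(active_source([primary, backup], now=100.5).name)  # primary

backup.last_seen = 104.0  # backup keeps sending; primary has been silent for 4 s
print(active_source([primary, backup], now=104.0).name)  # backup
```

Because the backup never changes its own priority, takeover time in this model is bounded by the loss timeout, which is the number to compare against your maximum acceptable outage.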

Node and gateway redundancy behavior

Nodes and DMX/IP gateways are the last-mile devices that must behave predictably when upstream control changes. Configure gateways with explicit hold, failover, and safe-state policies that match your lighting design and FOH expectations.

Recommended configurations:

  • Hold-last-state with a configurable timeout rather than an immediate blackout.
  • Failover priority for multiple control sources, with logs or LED state indicators so technicians can confirm which source is active at a glance.
  • Local merge rules that favor the most recent valid stream or a fixed-priority table. Avoid 'first come' modes unless you have strict control over network timing.
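
To see why the merge mode matters, the following sketch compares two common behaviors on the same pair of frames. The function names and the three-channel frames are hypothetical, and real nodes implement merging in firmware with their own options; the point is only that highest-takes-precedence (HTP) merging can produce a frame that neither console sent.

```python
def merge_htp(frame_a, frame_b):
    """Highest-takes-precedence: per-channel maximum of two DMX frames."""
    return bytes(max(a, b) for a, b in zip(frame_a, frame_b))


def merge_latest(stamped_frames):
    """Most-recent-stream: use the whole frame with the newest receive time."""
    return max(stamped_frames, key=lambda item: item[0])[1]


# Hypothetical three-channel frames from two consoles.
primary = bytes([255, 0, 128])
backup = bytes([0, 0, 200])

print(list(merge_htp(primary, backup)))                          # [255, 0, 200]
print(list(merge_latest([(10.00, primary), (10.02, backup)])))   # [0, 0, 200]
```

The HTP result mixes channel 1 from the primary with channel 3 from the backup, which is why the merge mode should be chosen deliberately and documented rather than left at a firmware default.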

For gateways that interface with dimmers, enable an output ramp to the safe level over a short window to avoid visible flicker or sudden jumps when inputs change.
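
As a rough model of hold-then-ramp behavior, the sketch below computes the level a gateway would drive on one channel after a given period without input. The function name, the 5 second hold, the 2 second ramp, and the linear fade are all assumptions for illustration; actual gateways expose their own parameters and curves.

```python
def node_output(last_level, safe_level, silence_s, hold_s=5.0, ramp_s=2.0):
    """Level driven on one channel after `silence_s` seconds without input."""
    if silence_s <= hold_s:
        return last_level  # still holding the last valid value
    if ramp_s <= 0:
        return safe_level  # no ramp configured: jump straight to the safe level
    progress = min((silence_s - hold_s) / ramp_s, 1.0)
    # Linear fade from the held value to the safe level, clamped at the end.
    return round(last_level + (safe_level - last_level) * progress)


for t in (0.0, 5.0, 6.0, 7.0, 10.0):
    print(t, node_output(last_level=200, safe_level=40, silence_s=t))
# 0.0 200
# 5.0 200
# 6.0 120
# 7.0 40
# 10.0 40
```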

Link, path management, and physical considerations

Redundancy is only as good as its physical implementation. Use separate cable routes for uplinks to reduce the chance of a single point of failure (e.g., a cable crushed under a door or a break in a cable trunk). For FOH and stage paths, route primary and secondary fibers or copper runs on opposite sides of the venue when possible.

When doors and load-in points are involved, establish dedicated protection such as conduit or cable trays. Label both ends of each link with unique identifiers and record them in the venue network diagram so doors crew and stagehands understand which run serves which path.
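
If link records are kept in a structured form, segregation can be checked mechanically. The sketch below is one possible approach; the link IDs, segment names, and record layout are hypothetical and stand in for whatever your venue network diagram uses.

```python
def shared_segments(link_records, endpoints):
    """Route segments used by both primary and secondary paths, ignoring endpoints."""
    by_path = {"primary": set(), "secondary": set()}
    for record in link_records.values():
        by_path[record["path"]].update(record["route"])
    return (by_path["primary"] & by_path["secondary"]) - set(endpoints)


# Hypothetical records: both runs cross the same door threshold.
links = {
    "FOH-A-01": {
        "path": "primary",
        "route": ["foh", "door 3 threshold", "stage-left rack"],
    },
    "FOH-B-01": {
        "path": "secondary",
        "route": ["foh", "house-right tray", "door 3 threshold", "stage-left rack"],
    },
}

print(shared_segments(links, endpoints=["foh", "stage-left rack"]))
# {'door 3 threshold'}
```

An empty result means the two paths share nothing but their endpoints; anything else is a candidate single point of failure to walk and inspect.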

Addressing, multicast and protocol settings

IP addressing and multicast tuning are core to predictable control-plane behavior. Use static addressing for consoles, gateways and critical nodes where possible and reserve DHCP only for non-critical devices. For multicast protocols:

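  • Enable IGMP snooping on every distribution switch that carries sACN, and make sure an IGMP querier is present on the control VLAN so group memberships do not age out. Art-Net is sent as unicast or broadcast, which snooping does not constrain, so rely on VLAN containment for it.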
  • Pin multicast group ranges and document which universe ranges each console will publish and which nodes will subscribe to. Avoid dynamic universe switching during shows.
  • Set consistent source priorities and TTL limits so multicast does not leak into unrelated network segments if a routing change occurs.

Consider protocol-specific features: when using sACN, verify the priority value each source actually transmits; for Art-Net, confirm that node firmware supports your intended merge behavior across dual inputs.
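
When documenting which multicast groups a universe plan uses, it helps that E1.31 (sACN) derives the IPv4 group directly from the universe number: 239.255.x.y, where x and y are the high and low bytes of the universe. A minimal helper, with a function name chosen for this example:

```python
import ipaddress


def sacn_multicast_group(universe):
    """IPv4 multicast group that E1.31 assigns to a universe (1-63999)."""
    if not 1 <= universe <= 63999:
        raise ValueError(f"universe out of range: {universe}")
    return ipaddress.IPv4Address(f"239.255.{universe >> 8}.{universe & 0xFF}")


for universe in (1, 256, 300):
    print(universe, sacn_multicast_group(universe))
# 1 239.255.0.1
# 256 239.255.1.0
# 300 239.255.1.44
```

Listing these addresses next to the universe map makes it easier to match IGMP join and leave records on the switches to specific universes.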

Testing and rehearsal procedures

Validation is operationally focused: every venue layer should be tested under load and under failure conditions during rehearsal. Plan tests that simulate real incidents rather than purely lab conditions.

Suggested live tests to run in rehearsal:

  • Disconnect one uplink cable at the door and observe node hold times and backup console takeover time. Record the timing and compare it to your maximum acceptable outage window.
  • Force a console crash or simulated reboot and verify that backup console moves to the expected priority and that scenes resume or hold as documented.
  • Test gateway power loss by pulling PDU feed in a node rack. Confirm gateway output behavior and how quickly remote engineering can restore or switch feed without disrupting the audience experience.
  • Run a multicast load test with IGMP snooping enabled to confirm that distribution switches forward each group only to ports with subscribed nodes and do not drop control packets to those nodes.

Document results and tune timeouts, priorities, and merge modes until behavior is repeatable across multiple runs.
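
One way to make "repeatable" concrete is to record the takeover time from each rehearsal run and check both the worst case and the spread. The sketch below uses invented timings and thresholds (a 1.0 second window and a 0.2 second allowed spread); substitute the limits your production has agreed on.

```python
from statistics import mean


def evaluate_runs(takeover_times_s, max_outage_s, max_spread_s):
    """Summarise repeated failover timings against the agreed limits."""
    worst = max(takeover_times_s)
    spread = worst - min(takeover_times_s)
    return {
        "mean_s": round(mean(takeover_times_s), 3),
        "worst_s": worst,
        "spread_s": round(spread, 3),
        "within_window": worst <= max_outage_s,   # judged on the worst run
        "repeatable": spread <= max_spread_s,     # judged on run-to-run variation
    }


# Hypothetical takeover times from five rehearsal runs, in seconds.
runs = [0.62, 0.58, 0.71, 0.60, 0.64]
report = evaluate_runs(runs, max_outage_s=1.0, max_spread_s=0.2)
print(report)
```

Judging the window on the worst run rather than the mean keeps a single slow failover from being averaged away.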

Operational procedures for FOH, stage and doors

Operational procedures translate design into human actions. For FOH and stage crews, create short operational scripts that cover normal takeover, recovery from partial failures, and full failover. These scripts must be concise and rehearsed so they can be executed quickly at doors or FOH positions.

Concretely:

  • At doors: a labeled procedure for swapping the primary uplink cable to the secondary while a colleague at FOH monitors DMX or console output. Time the swap and report to FOH within a preset window.
  • For a backup console takeover: an operator checklist that includes verifying node indicators, ensuring the backup console is configured with the correct universe map, and announcing takeover on comms before taking live control.
  • For node or gateway faults: stage electricians should have pre-authorized PDU access and a documented safe power cycle procedure to avoid cascading failures in a rack.

Include clear abort criteria (for example, drop to a known lighting state and stop the show) and ensure stage management understands when to use them.

Monitoring, logging and post-show validation

Continuous monitoring gives you the data to prove determinism. Use logging on switches, consoles and gateways to capture failover events, multicast joins/leaves, and link flaps. Store logs centrally or export key events to the console show log.

After each performance and rehearsal, review logs for unexpected merges, priority preemption, and any deviations from the documented hold times. Include network diagrams and change records in the post-show report so recurring issues can be traced to physical or configuration causes.
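
For log review, a simple starting point is to merge the per-device logs into one timeline and express each event as an offset from the first. The device names, messages, and timestamps below are invented for illustration, and the sketch assumes every device exports ISO 8601 timestamps from synchronized clocks.

```python
from datetime import datetime


def merge_timeline(*logs):
    """Merge per-device event lists into one list ordered by timestamp."""
    events = [event for log in logs for event in log]
    return sorted(events, key=lambda e: datetime.fromisoformat(e[0]))


# Hypothetical exported events: (timestamp, device, message).
switch_log = [("2026-03-01T20:15:03.120", "switch-a", "port 12 link down")]
gateway_log = [
    ("2026-03-01T20:15:03.150", "gw-stage-left", "input lost, holding"),
    ("2026-03-01T20:15:03.900", "gw-stage-left", "source changed to backup"),
]
console_log = [("2026-03-01T20:15:03.400", "backup-console", "takeover active")]

timeline = merge_timeline(switch_log, gateway_log, console_log)
start = datetime.fromisoformat(timeline[0][0])
for stamp, device, message in timeline:
    offset = (datetime.fromisoformat(stamp) - start).total_seconds()
    print(f"+{offset:.3f}s  {device:15s} {message}")
```

In this invented example the gateway follows the backup 0.78 seconds after the link drops, a figure that can be compared directly with the documented hold time and outage window.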

Frequently Asked Questions

How quickly should a backup console assume control during a show?

Target a deterministic window based on the production: for music performances a sub-second takeover may be required; for theatre the team may tolerate a 2–5 second window if appropriate hold rules are set. Define your maximum allowable outage and design the priority and takeover method to meet it.

What should nodes do when they lose their multicast stream?

Nodes should be configured to hold the last valid values for a configured timeout and then move to a safe level or a predefined blackout depending on rig requirements. Immediate blanking should be avoided unless safety demands it.

Can I rely on STP or dynamic routing for control traffic?

Avoid relying solely on STP or dynamic routing protocols for the control plane during a live show; their convergence time can be unpredictable. Use them in the network core for redundancy, but keep control-layer routing simple and predictable with static routes and VLAN separation.

How do I validate physical redundancy at doors and load-in points?

Create a physical test that includes walking the route, documenting cable IDs, and performing a switchover under rehearsal time constraints. Regularly inspect cable protection at doors and verify that primary and secondary paths are physically segregated.

What logging is most useful after an incident?

Capture time-stamped console events, multicast join/leave records, switch port up/down events, and gateway state transitions. Align these with show logs and operator notes so you can correlate human actions with network events.