Wednesday, May 20, 2026

[Incident Report #034][DNS] Inter‑subnet Communication Failure

What Happened?
On May 15th, our upstream data center completed a router upgrade. As an
unintended side effect, the FurrIX subnets located within the data center
were no longer able to reach one another. Because this issue was isolated to
internal data‑center paths, neither external member traffic nor internet‑facing
name server traffic was affected.

The issue went undetected until May 19th because our monitoring system and
our email system reside on different subnets. With inter‑subnet communication
broken, the monitoring system could not hand its alerts to the mail system, so
none of them reached us.
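
The failure mode described above, where alerts travel over the same path that broke, can be caught by a probe that checks cross-subnet reachability directly. The following is a minimal sketch, not our actual monitoring setup: the function names are hypothetical, and it assumes the targets expose a TCP port that can be connected to.

```python
import socket
from typing import Callable, Iterable


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "no route to host".
        return False


def unreachable_targets(
    targets: Iterable[tuple[str, int]],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> list[tuple[str, int]]:
    """Probe each (host, port) pair and return the ones that failed."""
    return [(host, port) for host, port in targets if not probe(host, port)]
```

A probe like this is only useful if its results are reported over a path that does not cross the link being tested, for example by running it on each subnet and notifying through a channel local to that subnet. The `probe` parameter exists so the check can be exercised without touching the network.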

We observed the following issues:

  • Internal service reachability — Some internal services were unreachable from
    member connections.
  • NS2 isolation — Members could not reach NS2.
  • Stale zones on NS2 — NS2 could not reach NS1; as a result, its zones
    went stale on May 17th.
  • NMS visibility loss — The Network Management System could not reach devices
    on PHY One for accounting and monitoring.
  • Backup failures — PHY One could not reach the PBS instance on PHY Two,
    preventing nightly backups.
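
On the stale zones: a secondary name server keeps answering for a zone only until the SOA expire interval has elapsed since its last successful refresh from the primary. The sketch below shows that arithmetic. It assumes that "stale" here means the expire timer ran out, and every value in the example is hypothetical; this report does not state the actual SOA settings or refresh times.

```python
from datetime import datetime, timedelta, timezone


def zone_expiry(last_refresh: datetime, soa_expire_seconds: int) -> datetime:
    """Moment at which a secondary stops treating the zone as valid."""
    return last_refresh + timedelta(seconds=soa_expire_seconds)


def is_expired(last_refresh: datetime, soa_expire_seconds: int, now: datetime) -> bool:
    """True once the expire interval has fully elapsed since the last refresh."""
    return now >= zone_expiry(last_refresh, soa_expire_seconds)


# Hypothetical values for illustration only, not taken from our zones.
last_ok = datetime(2026, 5, 15, 6, 0, tzinfo=timezone.utc)
expire = 172800  # two days, in seconds

print(zone_expiry(last_ok, expire).isoformat())
```

With these made-up inputs, a zone last refreshed on the morning of May 15th would expire two days later. A longer expire value gives a secondary more time to ride out an outage of the primary, at the cost of serving older data for longer.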

What did we do to fix this?
We provided the data center with test results and trace data confirming the inter‑subnet
routing failure. They corrected the configuration on their side, restoring full communication
between PHY One and PHY Two. All internal services, monitoring and backup operations
have returned to normal.