DNS Migration Validation

DNS changes do not take effect everywhere at once. Propagation can take minutes to hours depending on TTLs, caching, and the resolvers involved. This guide shows how to validate a DNS change across regions before proceeding with the rest of your migration.

The Problem

When you update DNS records:

  • Some resolvers see the change immediately
  • Others cache the old value for the TTL duration
  • Some ISP resolvers ignore the TTL and cache for longer
  • Different regions may see different results

Proceeding with a migration before DNS has propagated can cause outages for some users.
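
To make the caching behaviour concrete, here is a minimal simulation of two resolvers that cached the old record at different moments. It is a toy model, not how any real resolver is implemented: the class name, the old IP 192.0.2.10, the expiry times, and the simulated clock are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachingResolver:
    """Toy resolver that caches one answer until its TTL runs out."""
    name: str
    cached_value: str
    expires_at: float  # seconds on a simulated clock

    def resolve(self, now: float, authoritative_value: str, ttl: int) -> str:
        # Refresh from the authoritative value only after the cached entry expires.
        if now >= self.expires_at:
            self.cached_value = authoritative_value
            self.expires_at = now + ttl
        return self.cached_value


# Hypothetical values: the record is switched from OLD_IP to NEW_IP at t=0.
OLD_IP, NEW_IP, TTL = "192.0.2.10", "192.0.2.100", 300

# Two resolvers whose cached copies of the old record expire at different times.
resolvers = [
    CachingResolver("resolver-a", OLD_IP, expires_at=30),
    CachingResolver("resolver-b", OLD_IP, expires_at=280),
]


def answers_at(now: float) -> dict[str, str]:
    """Return what each resolver answers at simulated time `now`."""
    return {r.name: r.resolve(now, NEW_IP, TTL) for r in resolvers}


early = answers_at(60)   # resolver-a has refreshed, resolver-b still serves the old IP
late = answers_at(300)   # both cached entries have expired by now
print(early)
print(late)
```

At t=60 the two resolvers disagree, which is exactly the window in which shutting down the old server would break some users.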

Solution: Multi-Region DNS Validation

Use Quismon's multi-region DNS checks to verify propagation before proceeding.

Step 1: Create a DNS Check Across All Regions

resource "quismon_check" "dns_propagation" {
  name = "DNS Propagation - api.example.com"
  type = "dns"
  config = jsonencode({
    domain          = "api.example.com"
    record_type     = "A"
    expected_values = ["192.0.2.100"] # Your NEW IP
  })
  # One entry per monitoring region (12 in total)
  regions = [
    "na-east-ewr", "na-west-lax", "na-central-dfw",
    "eu-central-fra", "eu-west-lhr", "eu-east-waw",
    "ap-southeast-sin", "ap-northeast-tyo", "ap-south-mum",
    "sa-east-gru", "af-south-jnb", "au-southeast-syd"
  ]
  interval_seconds = 60
}

Step 2: Wait for Global Propagation

Monitor the check until all regions report success. Use the dashboard or API:

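# Check propagation status via API
# Set CHECK_ID to the ID of the DNS check created in Step 1
# (a literal {check_id} in the URL would be treated by curl as a glob pattern)
curl -s -H "Authorization: Bearer $API_KEY" \
  "https://api.quismon.com/v1/checks/${CHECK_ID}/results?limit=12" | \
  jq '[.data[] | {region, success}] | group_by(.success) | map({status: .[0].success, count: length})'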

Expected output when fully propagated (one successful result for each of the 12 regions):

[
  {"status": true, "count": 12}
]
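
A count of 12 successes does not by itself show that each of the 12 regions reported once, so a per-region view can be useful. The sketch below performs that grouping in Python. It assumes the results arrive newest-first and that each item carries the `region` and `success` fields read by the jq filter above; the function name and the sample data are hypothetical.

```python
def propagation_summary(results, expected_regions):
    """Split regions into propagated and pending using the newest result per region.

    `results` is assumed to be ordered newest-first, each item a dict with
    "region" and "success" keys.
    """
    latest = {}
    for item in results:
        # Keep only the first (newest) result seen for each region.
        latest.setdefault(item["region"], bool(item["success"]))
    propagated = sorted(r for r in expected_regions if latest.get(r) is True)
    # A region with a failed result, or with no result at all, is still pending.
    pending = sorted(r for r in expected_regions if latest.get(r) is not True)
    return {"propagated": propagated, "pending": pending, "complete": not pending}


# Hypothetical sample: one region is up to date, one still fails, one has not reported.
sample = [
    {"region": "na-east-ewr", "success": True},
    {"region": "eu-central-fra", "success": False},
    {"region": "na-east-ewr", "success": False},  # older result, ignored
]
summary = propagation_summary(
    sample, ["na-east-ewr", "eu-central-fra", "ap-southeast-sin"]
)
print(summary)
```

Treating a region with no result as pending is a deliberately conservative choice: the migration should only continue once every expected region has positively confirmed the new value.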

Step 3: Proceed with Migration

Once all regions return the expected value, your DNS change has propagated to every region you are checking. It is now safe to:

  • Shut down the old server
  • Update dependent services
  • Announce the migration complete

Pattern: DNS Validation in Multistep Checks

For automated migrations, use DNS validation as an early step in a multistep check:

{
  "name": "Migration Validation Flow",
  "type": "multistep",
  "config": {
    "steps": [
      {
        "name": "wait-for-dns",
        "type": "dns",
        "config": {
          "domain": "api.example.com",
          "record_type": "A",
          "expected_values": ["192.0.2.100"]
        },
        "retry": {
          "count": 30,
          "interval_seconds": 60
        }
      },
      {
        "name": "health-check",
        "type": "https",
        "config": {
          "url": "https://api.example.com/health",
          "expected_status": [200]
        }
      },
      {
        "name": "smoke-test",
        "type": "https",
        "config": {
          "url": "https://api.example.com/api/v1/status",
          "expected_status": [200]
        }
      }
    ]
  },
  "regions": ["na-east-ewr"]
}

Note:
  • Multistep retry is per-step, not global
  • For multi-region DNS validation, create a separate DNS check
  • Use check dependencies to chain: DNS check → Multistep flow
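
The per-step retry behaviour can be sketched as follows. This is an illustration of the semantics described in the note, not Quismon's implementation: it assumes that `count` means retries after the first attempt, and the `run_steps` function and the probe callables are hypothetical.

```python
import time
from typing import Callable, List, Tuple

# (name, probe, retry count, interval in seconds)
Step = Tuple[str, Callable[[], bool], int, int]


def run_steps(
    steps: List[Step], sleep: Callable[[float], None] = time.sleep
) -> Tuple[bool, str]:
    """Run steps in order; each step gets its own retry budget."""
    for name, probe, retries, interval in steps:
        attempts = 1 + retries  # first try plus retries (assumed semantics)
        for attempt in range(attempts):
            if probe():
                break  # step passed, move on to the next step
            if attempt < attempts - 1:
                sleep(interval)
        else:
            # This step exhausted its own budget; later steps never run.
            return False, name
    return True, ""


waited = []
dns_answers = iter([False, False, True])  # DNS step succeeds on the third attempt
ok, failed_step = run_steps(
    [
        ("wait-for-dns", lambda: next(dns_answers), 30, 60),
        ("health-check", lambda: True, 0, 0),
    ],
    sleep=waited.append,  # record the waits instead of actually sleeping
)
print(ok, failed_step, sum(waited))
```

Under these assumptions the `wait-for-dns` step above can wait up to 30 × 60 s = 30 minutes before the flow fails, and that budget is not shared with the later steps.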

Pattern: Alert on Unexpected DNS Changes

DNS checks can detect unauthorized or accidental changes:

resource "quismon_check" "dns_integrity" {
  name = "DNS Integrity - Critical Records"
  type = "dns"
  config = jsonencode({
    domain          = "payment.example.com"
    record_type     = "A"
    expected_values = ["192.0.2.50"] # ONLY this IP is valid
  })
  regions          = ["na-east-ewr", "eu-central-fra"]
  interval_seconds = 300 # Every 5 minutes
}

resource "quismon_alert_rule" "dns_changed" {
  check_id = quismon_check.dns_integrity.id
  name     = "DNS Record Changed"
  condition = {
    dns_changed = true  # Alert when DNS returns different value
  }
  notification_channel_ids = [quismon_notification_channel.pagerduty.id]
}
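
Conceptually, an integrity check compares the values a resolver returns against the allowed set. The sketch below shows one way to express that comparison. How Quismon evaluates `dns_changed` internally is not documented here, so the exact-match rule and the `dns_drift` helper are assumptions for illustration only.

```python
def dns_drift(expected, observed):
    """Compare observed record values against the allowed set (order-insensitive)."""
    expected_set, observed_set = set(expected), set(observed)
    return {
        # Values being served that are not on the allow list.
        "unexpected": sorted(observed_set - expected_set),
        # Allowed values that are no longer being served.
        "missing": sorted(expected_set - observed_set),
        "changed": observed_set != expected_set,
    }


# Hypothetical observation: an extra, unauthorized address appears in the answer.
drift = dns_drift(["192.0.2.50"], ["192.0.2.50", "203.0.113.7"])
print(drift)
```

Reporting unexpected and missing values separately makes an alert easier to triage: an unexpected address suggests tampering or a misconfiguration, while a missing one may only indicate a partial rollout.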

Best Practices

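  • Lower TTLs before changes: Set TTL to 300s (5 min) at least 24 hours before migration, or earlier if the current TTL is longer than 24 hours
  • Use multiple regions: Different resolvers = different caches
  • Wait 2x TTL: After changing the record, wait at least 2x the old TTL before shutting anything down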
  • Monitor during migration: Keep the DNS check running to catch regressions
  • Keep old server running: Don't shut down until DNS is fully propagated
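
The timing rules above reduce to simple date arithmetic. The sketch below is one reading of them: it interprets "old TTL" as the TTL in effect at cutover, and it lowers the TTL at least one full original TTL ahead so that caches holding the long TTL have expired. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def migration_timeline(
    cutover: datetime,
    original_ttl_s: int,
    ttl_at_cutover_s: int = 300,
    lead_time_s: int = 24 * 3600,
):
    """Derive key times from the best-practice rules above."""
    # Lower the TTL early enough that every cache holding the original TTL has expired.
    lower_ttl_by = cutover - timedelta(seconds=max(original_ttl_s, lead_time_s))
    # After the record change, wait at least twice the TTL that was in effect.
    earliest_shutdown = cutover + timedelta(seconds=2 * ttl_at_cutover_s)
    return {"lower_ttl_by": lower_ttl_by, "earliest_shutdown": earliest_shutdown}


# Hypothetical cutover time with a one-day original TTL.
plan = migration_timeline(datetime(2024, 1, 10, 12, 0), original_ttl_s=86400)
print(plan["lower_ttl_by"], plan["earliest_shutdown"])
```

If the TTL was not lowered beforehand, pass the original TTL as `ttl_at_cutover_s` too; the earliest shutdown time then moves out accordingly.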

See Also