DevOps & SRE Guide

Operational monitoring patterns for deployment verification, CI/CD integration, and incident response.

Deployment Verification

Create expiring checks to verify deployments across all regions:

# After deploying, create a temporary verification check
curl -X POST https://api.quismon.com/v1/checks \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "deploy-verify-'$VERSION'",
    "type": "https",
    "config": {
      "url": "https://api.example.com/health",
      "expected_status": [200],
      "expected_body": "version:'$VERSION'"
    },
    "regions": ["na-east-ewr", "eu-central-fra", "ap-southeast-sin"],
    "simultaneous_regions": true,
    "interval_seconds": 30,
    "expires_after_seconds": 3600
  }'

The check runs for 1 hour and then automatically deletes itself.

CI/CD Integration

GitHub Actions

# .github/workflows/deploy.yml
- name: Verify Deployment
  run: |
    CHECK_ID=$(curl -s -X POST https://api.quismon.com/v1/checks \
      -H "Authorization: Bearer ${{ secrets.QUISMON_API_KEY }}" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "ci-verify-${{ github.sha }}",
        "type": "https",
        "config": {"url": "${{ env.API_URL }}/health"},
        "regions": ["na-east-ewr"],
        "interval_seconds": 60,
        "expires_after_seconds": 600
      }' | jq -r '.id')

    # Wait for first result
    sleep 60

    # Check result
    RESULT=$(curl -s "https://api.quismon.com/v1/checks/$CHECK_ID/results?limit=1" \
      -H "Authorization: Bearer ${{ secrets.QUISMON_API_KEY }}" \
      | jq -r '.[0].success')

    if [ "$RESULT" != "true" ]; then
      echo "Deployment verification failed"
      exit 1
    fi

Blue/Green Deployment Monitoring

Use resolve_ip to target specific servers behind a load balancer:

{
  "name": "blue-pool-health",
  "type": "https",
  "config": {
    "url": "https://api.example.com/health",
    "resolve_ip": "10.0.1.5",
    "expected_status": [200]
  },
  "regions": ["na-east-ewr"]
}

Incident Response

During incidents, create tighter-interval checks for investigation:

# Create a 10-second interval check during investigation
curl -X POST https://api.quismon.com/v1/checks \
  -d '{
    "name": "incident-INC-123-investigation",
    "type": "https",
    "config": {"url": "https://api.example.com/debug/latency"},
    "regions": ["na-east-ewr"],
    "interval_seconds": 10,
    "expires_after_seconds": 1800
  }'

Review high-resolution data during the incident, then let the check expire.

Key Patterns

  • Expiring checks: CI/CD verification, incident investigation
  • Inverted checks: Verify dev/staging shouldn't be accessible
  • Multi-region simultaneous: Catch regional issues immediately
  • Failure rechecks: Confirm issues before alerting