
Designing a large-community schema that survives growth

Introduction to Taxonomy Design

Overview of Taxonomy Requirements

A large‑scale service‑provider network must maintain a single source of truth for three orthogonal concerns that drive operational automation:

1. Regional Preference – how traffic should be steered based on geography, latency, peering economics, and regulatory constraints.
2. Backup Intent – declarative statements about which prefixes, services, or state must be replicated, retained, or archived, together with retention‑policy metadata.
3. Maintenance Drains – scheduled or event‑driven reductions of forwarding capacity (e.g., graceful shutdown, traffic shift) that allow safe hardware/software upgrades without service impact.

The taxonomy must also remain extensible along the scaling dimensions summarized below.

If the taxonomy collapses under any of these dimensions, operators lose the ability to automate safely, leading to manual interventions, configuration drift, and increased MTTR.

Key Considerations for Scalability and Flexibility

| Dimension | Design Pressure | Recommended Approach | Trade‑off |
|---|---|---|---|
| Cardinality | 10⁴–10⁵ POPs, 10³ transit contracts, 10² vendor families | Store taxonomy as hierarchical, version‑controlled data (e.g., Git‑backed YAML/JSON) with schema validation (JSON‑Schema or OpenAPI). | Slight overhead for schema validation; mitigated by CI linting. |
| Change Velocity | POPs added weekly; contracts renegotiated monthly; policy updates daily | Immutable commits + pull‑request (PR) gating with automated test‑rendering pipeline. | Requires disciplined GitOps; rollback is a `git revert`. |
| Multi‑Vendor Rendering | Same logical intent must produce syntactically correct configs for ≥3 vendors | Policy‑as‑code layer: intent → intermediate representation (IR) → vendor‑specific Jinja2/Terraform templates. | Template maintenance burden; mitigated by shared library of Jinja macros. |
| Observability | Need to detect mismatches between intent and rendered config | Export render‑diff artifacts to a time‑series DB (e.g., Prometheus) and alert on non‑zero diff. | Adds pipeline step; negligible latency (<2 s per POP). |
| Safety | Prevent accidental traffic loss during maintenance drains | Blast‑radius tags attached to each drain intent; enforcement via OPA policies that reject drains exceeding a threshold (e.g., >5 % of total egress capacity). | OPA evaluation adds ~10 ms per intent; acceptable. |
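The Observability row calls for alerting on a non‑zero render diff. A minimal sketch using only the standard library, where `intended` and `rendered` are hypothetical stand‑ins for the IR‑rendered config and the config pulled from the device:

```python
import difflib

def render_diff(intended: str, rendered: str) -> list[str]:
    """Return unified-diff lines between intended and rendered config.

    An empty list means intent and device state agree; in the pipeline,
    the line count would be exported as a gauge and alerted on when > 0.
    """
    return list(difflib.unified_diff(
        intended.splitlines(), rendered.splitlines(),
        fromfile="intended", tofile="rendered", lineterm="",
    ))

intended = "set policy-options policy-statement REGIONAL_PREF term 1 then local-pref 180"
rendered = "set policy-options policy-statement REGIONAL_PREF term 1 then local-pref 150"
diff = render_diff(intended, rendered)
print(len(diff))  # non-zero: configuration drift detected
```

A per‑POP diff takes well under the quoted 2 s budget; the real pipeline would diff full rendered artifacts, not single lines.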

Regional Preference Encoding

Hierarchical Structure for Regional Preferences

Regional preference is expressed as a tree where each node corresponds to a geographic aggregation point (continent → country → metro → POP). Leaf nodes hold the actual preference values (e.g., local‑pref, MED, weight).

```
Continent
└─ Country
     ├─ Metro
     │   ├─ POP-A
     │   │   ├─ preference: {local_pref: 150, med: 10}
     │   │   └─ tags: ["low_latency", "eu-gdpr"]
     │   └─ POP-B
     │       ├─ preference: {local_pref: 120, med: 20}
     │       └─ tags: ["cost_optimized"]
     └─ Rural-Area
         └─ POP-C
             ├─ preference: {local_pref: 80, med: 100}
             └─ tags: ["backup_only"]
```

Each node may inherit preferences from its parent unless overridden. Inheritance is explicit (via `inherit: true`) to avoid accidental shadowing.
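The explicit‑inheritance rule can be sketched in a few lines of Python; the node layout mirrors the YAML above, and `resolve_preference` is a hypothetical helper, not part of any shipped tool:

```python
def resolve_preference(node: dict, parent_pref: dict) -> dict:
    """Merge a node's preference over its parent's, but only when the
    node explicitly opts in with inherit: true."""
    own = node.get("preference", {})
    if node.get("inherit", False):
        # Parent values fill the gaps; the node's own keys win.
        return {**parent_pref, **own}
    # inherit: false (or absent): the node stands alone.
    return dict(own)

metro = {"local_pref": 100, "med": 50}
pop_a = {"inherit": True, "preference": {"local_pref": 150}}
pop_c = {"inherit": False, "preference": {"local_pref": 80, "med": 100}}

print(resolve_preference(pop_a, metro))  # {'local_pref': 150, 'med': 50}
print(resolve_preference(pop_c, metro))  # {'local_pref': 80, 'med': 100}
```

Making inheritance opt‑in means a missing `inherit` key fails safe: the node simply gets no parent values rather than silently shadowing them.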

Attribute‑Based Encoding for Regional Variations

Beyond the hierarchy, we attach key‑value attributes that capture non‑hierarchical factors (e.g., regulatory regimes, peering type, SLA class). Attributes are stored as a flat map at each node and are consulted during policy rendering.

| Attribute | Type | Example Values | Usage |
|---|---|---|---|
| `regime` | enum | `gdpr`, `ccpa`, `none` | Influences data‑retention logic. |
| `peering_type` | enum | `transit`, `private`, `ixp` | Affects MED/local‑pref calculation. |
| `sla_class` | string | `platinum`, `gold`, `silver` | Maps to queue‑profile selection. |
| `capacity_mbps` | integer | `10000` | Used for drain‑blast‑radius checks. |
| `maintenance_window` | cron string | `"0 2 * * SAT"` | Default window for POPs lacking an explicit drain schedule. |

Attributes are typed via JSON‑Schema, enabling validation at commit time.
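A minimal schema fragment for this attribute map might look as follows (illustrative only; the field names match the table above, but the `$id` is a placeholder):

```json
{
  "$id": "https://example.com/schemas/node-attributes.json",
  "type": "object",
  "properties": {
    "regime": { "enum": ["gdpr", "ccpa", "none"] },
    "peering_type": { "enum": ["transit", "private", "ixp"] },
    "sla_class": { "type": "string" },
    "capacity_mbps": { "type": "integer", "minimum": 0 },
    "maintenance_window": { "type": "string" }
  },
  "additionalProperties": false
}
```

`additionalProperties: false` makes unknown keys a commit‑time error rather than silent noise, which keeps the attribute vocabulary under deliberate control.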

Examples of Regional Preference Encoding in Practice

File layout (Git‑ops repo):

```
/taxonomy/regional/
├─ continent/
│  ├─ na.yaml
│  ├─ eu.yaml
│  └─ apac.yaml
├─ country/
│  ├─ us.yaml
│  ├─ de.yaml
│  └─ jp.yaml
├─ metro/
│  ├─ nyc.yaml
│  ├─ frankfurt.yaml
│  └─ tokyo.yaml
└─ pop/
   ├─ nyc01.yaml
   ├─ nyc02.yaml
   ├─ de01.yaml
   └─ jp01.yaml
```


**Sample `nyc01.yaml`:**  

```yaml
# nyc01.yaml
parent: metro/nyc
inherit: true
preference:
  local_pref: 180   # higher than default to favor local exit
  med: 5
attributes:
  regime: none
  peering_type: transit
  sla_class: platinum
  capacity_mbps: 40000
  maintenance_window: "0 3 * * SUN"
tags:
  - "high_capacity"
  - "primary_exit"
```

Rendering snippet (Jinja2) for Juniper Junos:

```jinja
{% set pref = regional_data['preference'] %}
set policy-options policy-statement REGIONAL_PREF term 1 from protocol bgp
set policy-options policy-statement REGIONAL_PREF term 1 then local-pref {{ pref.local_pref }}
set policy-options policy-statement REGIONAL_PREF term 1 then metric {{ pref.med }}
```

If `inherit: false` is set, the node ignores all parent values and relies solely on its own block, which is useful for overriding a continent‑wide default for a specific POP.


Backup Intent and Data Management

Backup Intent Encoding and Data Retention

Backup intent is modeled as a declarative policy attached to any taxonomy node (region, POP, or service). It specifies a storage target, the scope of data covered (e.g., a prefix list), retention rules (duration, version count, legal hold), and the required consistency mode.


**Example intent for a POP’s routing table:**  

```yaml
# backup/nyc01-routing.yaml
target: s3://backup-nyc/routing/
scope:
  prefix_list: ["10.0.0.0/8", "192.168.0.0/16"]
retention:
  duration_days: 365
  max_versions: 12
  legal_hold: false
consistency: snapshot
tags: ["routing", "daily"]
```

Data Management Strategies for Backup and Archive

  1. Tiered Storage – Recent snapshots (<30 d) go to high‑performance object store (e.g., S3‑Standard); older data transitions to Glacier‑Deep Archive via lifecycle rules.
  2. Immutable Writes – Enable S3 Object Lock (Governance mode) to satisfy legal‑hold requirements; OPA validates that any intent with legal_hold: true targets a bucket with lock enabled.
  3. Deduplication – Backup agent computes SHA‑256 of each chunk; identical chunks across POPs are stored once (reference‑counted).
  4. Verification – Nightly job reads a random 1 % sample, recomputes hash, and compares against stored metadata; mismatches trigger a PagerDuty alert.
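The chunk‑level deduplication in point 3 can be sketched as follows; `ChunkStore` is a hypothetical in‑memory stand‑in for the reference‑counted object store:

```python
import hashlib

class ChunkStore:
    """Content-addressed store: identical chunks are kept once and
    reference-counted, mirroring the backup agent's dedup behavior."""

    def __init__(self):
        self._chunks: dict[str, bytes] = {}
        self._refs: dict[str, int] = {}

    def put(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self._chunks:
            self._chunks[digest] = chunk  # only the first copy pays the storage cost
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest

    def stored_chunks(self) -> int:
        return len(self._chunks)

store = ChunkStore()
# The same routing snapshot uploaded from two POPs...
store.put(b"10.0.0.0/8 via nyc01")
store.put(b"10.0.0.0/8 via nyc01")
store.put(b"192.168.0.0/16 via de01")
print(store.stored_chunks())  # 2: the duplicate chunk is stored once
```

Reference counting matters for deletion: a chunk can only be garbage‑collected when every intent referencing it has aged out of its retention window.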

CLI Examples for Backup Intent Configuration

Assuming a custom CLI taxonomyctl that interacts with the Git‑ops repo via a lightweight API:

```shell
# Validate a new backup intent file against schema
taxonomyctl backup validate --file backup/nyc01-routing.yaml

# Intent diff: show what would change if we merge this intent
taxonomyctl backup diff --file backup/nyc01-routing.yaml --branch main

# Apply intent (creates a PR, runs CI, auto-merges on success)
taxonomyctl backup apply --file backup/nyc01-routing.yaml --msg "Add daily routing backup for NYC01"

# List all intents affecting a given POP
taxonomyctl backup list --pop nyc01 --format json
```

Under the hood, `taxonomyctl backup apply` runs:

  1. `git checkout -b backup/nyc01-routing-<timestamp>`
  2. Copies the file to the appropriate directory (`/taxonomy/backup/`)
  3. Runs `jsonschema -i backup/nyc01-routing.yaml $SCHEMA`
  4. Triggers CI pipeline (GitHub Actions) that renders a dry‑run of the backup agent config and pushes an artifact.
  5. If all checks pass, opens a PR; upon approval and merge, the intent becomes effective.

Maintenance Drains and Scheduling

Maintenance Window Scheduling and Resource Allocation

Each POP may declare zero or more maintenance drain intents. A drain intent contains a scope (which protocol or service to drain), a schedule, a grace period, a blast‑radius bound, and optional dependencies and tags.

Schema excerpt:

```json
{
  "$id": "https://example.com/schemas/drain-intent.json",
  "type": "object",
  "required": ["scope", "schedule", "grace_period_seconds", "blast_radius_percent"],
  "properties": {
    "scope": { "type": "string", "pattern": "^(bgp|isis|ospf|l2vpn)\\s+.+$" },
    "schedule": { "type": "string", "format": "date-time" },
    "grace_period_seconds": { "type": "integer", "minimum": 0 },
    "blast_radius_percent": { "type": "number", "minimum": 0, "maximum": 100 },
    "dependencies": { "type": "array", "items": { "type": "string" } },
    "tags": { "type": "array", "items": { "type": "string" } }
  }
}
```
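The blast‑radius guard described in the scalability table can be sketched in pure Python (the production path uses OPA; `validate_drain_intent` and the 5 % threshold are illustrative):

```python
MAX_BLAST_RADIUS_PERCENT = 5.0  # illustrative cap, per the Safety trade-off row

def validate_drain_intent(intent: dict) -> list[str]:
    """Return a list of policy violations; an empty list means the drain is admissible."""
    errors = []
    # Mirror the schema's "required" list.
    for field in ("scope", "schedule", "grace_period_seconds", "blast_radius_percent"):
        if field not in intent:
            errors.append(f"missing required field: {field}")
    radius = intent.get("blast_radius_percent", 0)
    if radius > MAX_BLAST_RADIUS_PERCENT:
        errors.append(
            f"blast_radius_percent {radius} exceeds cap {MAX_BLAST_RADIUS_PERCENT}"
        )
    return errors

ok = {"scope": "bgp peer 192.0.2.1", "schedule": "2024-06-01T02:00:00Z",
      "grace_period_seconds": 300, "blast_radius_percent": 3.5}
too_big = dict(ok, blast_radius_percent=12.0)
print(validate_drain_intent(ok))            # []
print(len(validate_drain_intent(too_big)))  # 1
```

Rejecting the intent before it ever reaches a device API is what keeps a fat‑fingered drain from becoming an outage.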

Automated Maintenance Drains using Scripts and Tools

The preferred automation stack pairs an Airflow DAG for scheduling and orchestration with Ansible playbooks for device‑level execution, as shown below.

Code Examples for Maintenance Drain Implementation

Airflow DAG snippet (Python):

```python
import json

from airflow import DAG
from airflow.providers.http.operators.http import SimpleHttpOperator
from airflow.utils.dates import days_ago

def load_drain_intents(**context):
    """Alternative fetch path: shell out to the taxonomyctl CLI
    instead of hitting the HTTP API."""
    import subprocess
    result = subprocess.run(
        ["taxonomyctl", "drain", "list", "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

with DAG(
    dag_id="maintenance_drain_scheduler",
    schedule_interval=None,  # triggered externally by taxonomy updates
    start_date=days_ago(1),
    catchup=False,
) as dag:

    # Pull the current drain intents from the taxonomy API.
    fetch_intents = SimpleHttpOperator(
        task_id="fetch_intents",
        http_conn_id="taxonomy_api",
        endpoint="/drain/intents",
        method="GET",
        response_filter=lambda r: json.loads(r.text),
    )

    # Dynamic task mapping (Airflow 2.4+): one drain execution per fetched
    # intent. A safety check calling a custom blast-radius validation
    # endpoint would precede this step (omitted for brevity).
    drain_tasks = SimpleHttpOperator.partial(
        task_id="execute_drain",
        http_conn_id="vendor_api",
        endpoint="/drain/execute",
        method="POST",
        headers={"Content-Type": "application/json"},
    ).expand(data=fetch_intents.output.map(json.dumps))
```

Ansible playbook for Juniper Junos drain:

```yaml
# item is a single drain intent, supplied by the orchestrator,
# e.g. by including this play with loop: "{{ drain_intents }}".
- name: Execute BGP peer drain on Juniper
  hosts: "{{ target_pop }}"
  vars:
    peer: "{{ item.peer }}"
    grace: "{{ item.grace_period_seconds }}"
  tasks:
    - name: Issue graceful shutdown
      junipernetworks.junos.junos_config:
        lines:
          - "set protocols bgp group {{ item.group }} neighbor {{ peer }} shutdown"
        comment: "Maintenance drain per taxonomy intent"
      register: shutdown_result

    - name: Wait for grace period
      ansible.builtin.pause:
        seconds: "{{ grace }}"

    - name: Verify peer state
      junipernetworks.junos.junos_command:
        commands:
          - "show bgp neighbor {{ peer }}"
      register: bgp_state
      until: "'State: Idle' in bgp_state.stdout[0]"
      retries: 10
      delay: 5

    - name: Record undo command
      # (undo logic would be added here)
      ansible.builtin.debug:
        msg: "Record undo command for {{ peer }}"
```
