LinkState

Migrating from standard to large communities safely

Introduction to Staged Migration

This guide outlines a staged migration from standard BGP communities to Large Communities (RFC 8092). It covers pre‑checks, dual‑tag windows, rollback triggers, and proof points that verify traffic‑engineering (TE) behavior remains stable during the cutover.


Overview of Standard and Large Communities

Benefits of Large Communities


Pre‑Migration Prechecks

Network Assessment and Inventory

  1. Device‑level inventory – Export BGP speakers with OS version, feature set, and current community‑related configuration.
    # Juniper Junos example
    show version | match "Junos"
    show configuration protocols bgp | display set | match community
  2. Feature support verification – Confirm each device understands the Large Community attribute (type 32) and can send/receive it.
    • Cisco IOS‑XR: show bgp attributes includes large-community if supported.
    • Juniper Junos: show route receive-protocol bgp <ip> detail | match large (Junos lists large communities on the Communities line with a large: prefix).
    • Arista EOS: show ip bgp neighbors <ip> received-routes | include large-community.
      Any device lacking support is a hard stop for migration on that node.
  3. Topology mapping – Identify iBGP full‑mesh, route‑reflector clusters, and eBGP peers where communities are exchanged. Document the blast radius of a mis‑configured community (the set of peers that receive the updated attribute).
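
The blast radius from step 3 can be estimated from the session inventory. The following is a minimal sketch, assuming a hypothetical `sessions` map (speaker name to the peers it advertises to); the device names are invented for illustration. It deliberately over-estimates, because it ignores iBGP re-advertisement rules and any policy that strips communities.

```python
from collections import deque

def blast_radius(sessions, origin):
    """Return every speaker that could receive an attribute set at `origin`.

    `sessions` maps a speaker name to the peers it advertises to.
    This is a plain breadth-first walk over that map.
    """
    seen = {origin}
    queue = deque([origin])
    while queue:
        speaker = queue.popleft()
        for peer in sessions.get(speaker, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    seen.discard(origin)  # the origin itself is not part of the radius
    return seen

# Hypothetical topology: one edge router, a route reflector, two clients.
sessions = {
    "edge1": ["rr1"],
    "rr1": ["pe1", "pe2"],
    "pe2": ["cust-a"],
}
print(sorted(blast_radius(sessions, "edge1")))
```

Treat the result as an upper bound when deciding which peers to watch during the pilot.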

Community Configuration Review

Traffic Engineering Analysis


Migration Planning and Design

Dual‑Tag Windows Strategy

A dual‑tag window is a period during which a prefix carries both the original standard community and the new large community, allowing receivers to continue using the old attribute while validating the new one.
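
To make the definition concrete, here is a small sketch that classifies a received route by which tags it carries. The function name and the idea of passing communities as plain strings are assumptions for illustration; the tag values are the ones used in the configuration example in this guide.

```python
def tag_state(standard, large, expected_standard, expected_large):
    """Classify a route by the presence of the old and new tags.

    `standard` and `large` are the community strings seen on the route.
    """
    has_standard = expected_standard in standard
    has_large = expected_large in large
    if has_standard and has_large:
        return "dual"
    if has_standard:
        return "standard-only"
    if has_large:
        return "large-only"
    return "untagged"

# A route inside the dual-tag window carries both attributes.
print(tag_state(["65000:200"], ["1:200:10"], "65000:200", "1:200:10"))
```

During the window every pilot prefix should classify as "dual"; "standard-only" on a receiver suggests the large community was dropped somewhere along the path.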

  1. Selection of pilot prefixes – Choose a small, representative set (≤ 5 % of TE‑affected prefixes) that:
    • Are not involved in critical services (no voice or real‑time traffic).
    • Have stable BGP adjacency (no flaps in the last 48 h).
    • Appear in multiple geographic POPs to test cross‑region behavior.
  2. Attribute addition – On the advertising router, configure the route‑map to set both attributes:
    ! Cisco IOS XE‑style example (route-map syntax; IOS‑XR expresses the same intent with route-policy)
    route-map SET_LARGE_COMM permit 10
      set community 65000:200 additive
      set large-community 1:200:10 additive
    !
    neighbor 10.0.0.1 route-map SET_LARGE_COMM out
  3. Receiver‑side validation – Ensure peers are configured to accept both attribute types (default behavior) and optionally log receipt of large communities:
    ! Juniper Junos
    protocols {
        bgp {
            group TE-PEERS {
                neighbor 10.0.0.2 {
                    import [ TE-IMPORT ];
                    log-updown;
                }
            }
        }
    }
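    policy-options {
        /* Junos defines large communities as named communities with a large: prefix */
        community TE-STD members 65000:200;
        community TE-LARGE members large:1:200:10;
        policy-statement TE-IMPORT {
            term large-comm {
                /* Listing both names matches a route that carries either tag */
                from community [ TE-STD TE-LARGE ];
                then accept;
            }
        }
    }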
  4. Window duration – Define a minimum observation period (e.g., 2 × MRAI, the minimum route advertisement interval, + 2 × BGP hold time). Typical values: 10‑15 minutes in a stable iBGP fabric.
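
The pilot selection criteria and the window formula above can be sketched as follows. The `Candidate` fields, the "at least two POPs" reading of "multiple", and the timer values in the example are assumptions for illustration, not values taken from any particular platform.

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    prefix: str
    critical: bool            # carries voice or real-time traffic
    hours_since_flap: float   # time since the last adjacency flap
    pops: int                 # number of POPs where the prefix appears

def select_pilots(candidates, te_prefix_count, max_share=0.05):
    """Keep eligible prefixes, capped at max_share of TE-affected prefixes."""
    eligible = [
        c for c in candidates
        if not c.critical and c.hours_since_flap >= 48 and c.pops >= 2
    ]
    limit = math.floor(te_prefix_count * max_share)
    return eligible[:limit]

def min_window_seconds(advertisement_interval_s, hold_time_s):
    """Minimum observation period: twice the advertisement interval plus twice the hold time."""
    return 2 * advertisement_interval_s + 2 * hold_time_s

# Assumed timers: 30 s advertisement interval, 180 s hold time.
print(min_window_seconds(30, 180))
```

The formula gives a floor, not a target; a longer window leaves room for slow convergence events to show up in the KPIs.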

Rollback Triggers and Procedures

Trigger conditions (observable, not speculative):

  1. Loss of TE intent – Deviation > 5 % in selected KPI (average path latency, jitter, or packet loss) for any pilot prefix during the dual‑tag window.
  2. Attribute corruption – Detection of malformed large‑community values in received updates, i.e. values that do not parse as three 4‑octet fields (an attribute length that is not a non‑zero multiple of 12 octets).
  3. Control‑plane instability – BGP session flaps > 3 times per peer within the window, or update‑queue depth > 80 % of configured limit.
  4. Operator‑initiated abort – Manual command from the change‑management system (e.g., change abort in an automation orchestrator).
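
The trigger conditions above can be expressed as a single check. This is a sketch under stated assumptions: the function and argument names are hypothetical, KPIs are passed as plain dictionaries, and a zero baseline (for example, zero packet loss) is treated as breached by any non-zero reading.

```python
def deviation(baseline, current):
    """Relative deviation from baseline; any change from a zero baseline is unbounded."""
    if baseline == 0:
        return 0.0 if current == 0 else float("inf")
    return abs(current - baseline) / abs(baseline)

def fired_triggers(baseline_kpi, current_kpi, flaps_per_peer,
                   queue_depth, queue_limit,
                   malformed_updates=0, operator_abort=False):
    """Return the list of rollback triggers that are currently breached."""
    fired = []
    for name, base in baseline_kpi.items():
        if deviation(base, current_kpi[name]) > 0.05:   # trigger 1
            fired.append(f"te-intent:{name}")
    if malformed_updates > 0:                            # trigger 2
        fired.append("attribute-corruption")
    for peer, flaps in flaps_per_peer.items():           # trigger 3
        if flaps > 3:
            fired.append(f"flaps:{peer}")
    if queue_depth > 0.8 * queue_limit:                  # trigger 3
        fired.append("update-queue")
    if operator_abort:                                   # trigger 4
        fired.append("operator-abort")
    return fired

print(fired_triggers({"latency_ms": 20.0}, {"latency_ms": 21.5},
                     {"10.0.0.2": 1}, queue_depth=10, queue_limit=100))
```

An empty list means the window may continue; any entry is a reason to start the rollback procedure.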

Rollback procedure (operational, not automatic):

  1. Cease advertising large community – Remove the set large-community clause from the route‑map, leaving only the standard community.
  2. Withdraw dual‑tagged updates – Issue a soft outbound refresh (clear ip bgp <neighbor> soft out or equivalent) to resend updates with only the standard community.
  3. Verify revert – Confirm peers no longer see the large‑community attribute: on each peer, show bgp neighbor <ip> received-routes | include large-community should return no output.
  4. Notify stakeholders – Log the rollback event with timestamp, trigger condition, and affected prefixes.
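
Step 4 is easier to audit if the log entry is structured. A minimal sketch, assuming a JSON record with hypothetical field names (`event`, `timestamp`, `trigger`, `prefixes`); where the record is sent is left to the local logging setup.

```python
import json
from datetime import datetime, timezone

def rollback_record(trigger, prefixes, now=None):
    """Build a JSON log line describing a rollback event."""
    now = now or datetime.now(timezone.utc)
    return json.dumps({
        "event": "large-community-rollback",
        "timestamp": now.isoformat(),
        "trigger": trigger,
        "prefixes": sorted(prefixes),
    })

# Documentation prefix used purely as an example.
print(rollback_record("te-intent:latency_ms", ["203.0.113.0/24"]))
```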

Proof Points for Traffic Engineering Stability

For each pilot prefix, collect evidence before, during, and after the dual‑tag window:

    • Path selection – show bgp <prefix> (next‑hop, local‑pref, MED, IGP metric).

A success proof point is defined as: All KPIs remain within ± 2 % of baseline and the large community is present on ≥ 95 % of expected peers throughout the window.
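
The success definition can be checked per sample. The sketch below assumes KPIs arrive as dictionaries and peers as lists of names; all identifiers are hypothetical. "Throughout the window" then means the check holds for every sample taken.

```python
def proof_point_met(baseline, observed, peers_with_large, expected_peers,
                    kpi_tolerance=0.02, min_peer_share=0.95):
    """True when one sample satisfies the success proof point."""
    if not expected_peers:
        raise ValueError("expected_peers must not be empty")
    kpis_ok = all(
        abs(observed[name] - base) <= kpi_tolerance * abs(base)
        for name, base in baseline.items()
    )
    present = set(peers_with_large) & set(expected_peers)
    share = len(present) / len(set(expected_peers))
    return kpis_ok and share >= min_peer_share

def window_succeeded(baseline, samples, expected_peers):
    """Apply the check to every (observed_kpis, peers_with_large) sample."""
    return all(
        proof_point_met(baseline, observed, peers, expected_peers)
        for observed, peers in samples
    )
```

Note the gap between this ± 2 % success band and the 5 % rollback trigger: a sample that lands between the two neither proves success nor forces a rollback, so it is worth deciding in advance how such a window is judged.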


Implementation and Cutover

Enabling Large Communities on Network Devices

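  1. OS upgrade / feature enable – If required, schedule a maintenance window to install a version that supports large communities. Confirm the running release with show version and check it against the vendor's release notes for large‑community support; show version output does not list the feature itself.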
  2. Global capability – Some platforms need explicit enablement (e.g., Cisco IOS‑XR: ensure bgp log-neighbor-changes is on for visibility; bgp graceful-restart does not affect large‑community support).
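  3. Neighbor capability advertisement – Large communities are not a negotiated capability; they are sent only if the neighbor's send-community setting covers them. Ensure send-community both or the platform's equivalent is set (some implementations, such as FRR and Arista EOS, use a separate large keyword), and avoid send-community extended on its own, which would stop standard communities from being sent during the dual‑tag window.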

Configuring Dual‑Tag Windows

Executing Rollback Triggers

If any trigger condition fires:

  1. Automated detection – A monitoring script evaluates KPI thresholds and, upon breach, invokes the rollback playbook (e.g., Ansible bgp_large_community_rollback.yml).
  2. Manual execution – Operator runs:
    configure terminal
    no route-map DUAL_TAG_OUT permit 10
    route-map DUAL_TAG_OUT permit 10
      set community 65000:200 additive
    !
    router bgp 65000
     address-family ipv4 unicast
      neighbor 10.0.0.2 route-map DUAL_TAG_OUT out
     !
    end
    ! Push clean update
    clear bgp ipv4 unicast 10.0.0.2 soft out
  3. Verification – Run the validation commands from the Proof Points section to confirm the large community is gone and TE behavior reverted.
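
For the automated path in step 1, the monitoring script mainly needs to turn a breached trigger into a playbook run. A minimal sketch, assuming `ansible-playbook` is on the PATH; the inventory file name and the `peer` extra variable are hypothetical and depend on how the playbook is written. The default is a dry run that only returns the command.

```python
import subprocess

PLAYBOOK = "bgp_large_community_rollback.yml"  # playbook name from the step above

def rollback_command(peer, inventory="inventory.ini"):
    """Build the ansible-playbook command line for one peer."""
    return [
        "ansible-playbook",
        "-i", inventory,                # hypothetical inventory file
        PLAYBOOK,
        "--extra-vars", f"peer={peer}", # hypothetical variable name
    ]

def run_rollback(peer, dry_run=True):
    """Return the command; execute it only when dry_run is False."""
    cmd = rollback_command(peer)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises if the playbook fails
    return cmd

print(" ".join(run_rollback("10.0.0.2")))
```

Keeping the dry run as the default makes it harder for a misfiring monitor to roll back production by accident during testing.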

Troubleshooting and Validation

Common Issues and Debugging Techniques

| Symptom | Likely Cause | Diagnostic Command |
| --- | --- | --- |
| Large community not seen on peer | Neighbor missing send-community both or OS version too old | show bgp neighbor received-routes |
| BGP session reset after attribute added | Malformed large‑community value (> 4 bytes) | show logging |
| TE KPI degradation | Incorrect large‑community value causing unintended local‑pref/MED | Compare show route <prefix> detail before/after |
| High CPU on route‑processor | Excessive community‑list processing due to large ACLs | show process cpu |
| Duplicate attributes causing confusion | Receiver prefers standard community due to route‑map order | show route |
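
Malformed values are cheapest to catch before they reach a router. The sketch below validates the textual form of a large community (three colon-separated decimal fields, each fitting in 4 octets, as RFC 8092 defines them). The function name is hypothetical; it checks configuration input only and does not inspect attributes on the wire.

```python
MAX_FIELD = 2**32 - 1  # each of the three fields is a 4-octet unsigned integer

def parse_large_community(text):
    """Parse 'global:local1:local2' into a tuple of ints, or raise ValueError."""
    parts = text.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {text!r}")
    fields = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"non-numeric field {part!r} in {text!r}")
        value = int(part)
        if value > MAX_FIELD:
            raise ValueError(f"field {value} exceeds 4 octets in {text!r}")
        fields.append(value)
    return tuple(fields)

print(parse_large_community("1:200:10"))
```

Running every value in a change through a check like this before deployment removes one common cause of the malformed-attribute symptom.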

CLI Examples for Troubleshooting

Validation of Traffic Engineering Behavior

  1. Baseline capture – Before enabling large community, record:
    show bgp ipv4 unicast <prefix> | include Neighbor|Local_Pref|MED
    ping -c 100 -i 0.2 <destination>  # log latency/jitter
  2. During dual‑tag window – Sample every 30 s:
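    # Run from a shell that can invoke the router CLI; wrap "show bgp" as your platform requires.
    while true; do
      timestamp=$(date +%s)
      # Adjust the awk pattern and fields to your platform's best-path output.
      bgp_out=$(show bgp ipv4 unicast <prefix> | awk '/^>/ {print $2,$5,$6}')
      ping_out=$(ping -c 20 -i 0.2 <destination> | tail -1)
      echo "$timestamp $bgp_out $ping_out" >> /var/log/te_validation.log
      sleep 30
    done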
  3. Post‑window analysis – Compute deltas; if any exceed thresholds, trigger rollback.
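
The delta computation in step 3 can be sketched against the log written during the window. This assumes each log line ends with the ping summary in the usual `min/avg/max/mdev` form and that the baseline average latency was recorded separately; function names are hypothetical.

```python
import re
from statistics import mean

# Matches the summary ping prints last, e.g. "rtt min/avg/max/mdev = 9.8/10.0/10.4/0.2 ms"
RTT_SUMMARY = re.compile(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")

def avg_rtts(lines):
    """Extract the average RTT (second field) from each log line that has one."""
    samples = []
    for line in lines:
        match = RTT_SUMMARY.search(line)
        if match:
            samples.append(float(match.group(2)))
    return samples

def latency_delta(baseline_ms, lines):
    """Relative change of mean latency during the window versus the baseline."""
    samples = avg_rtts(lines)
    if not samples:
        raise ValueError("no RTT samples found in log")
    return (mean(samples) - baseline_ms) / baseline_ms

def load_log(path="/var/log/te_validation.log"):
    """Read the validation log written by the sampling loop."""
    with open(path) as handle:
        return handle.read().splitlines()
```

A call such as `latency_delta(baseline_ms, load_log())` yields a fraction that can be compared with the rollback and success thresholds defined earlier.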

Scaling Limitations and Considerations

Large Community Attribute Limits

Network Device Capacity and Performance

