Introduction to Retransmission Storms
Definition and Causes
A retransmission storm occurs when an endpoint repeatedly retransmits packets that are dropped because an intermediate stateful device (NAT, firewall, load‑balancer) has removed its translation or connection‑tracking entry while the flow is still active. The surviving host interprets the lack of ACK as loss and triggers TCP retransmission timers; if the state remains absent, an exponential back‑off loop ensues, consuming bandwidth and CPU.
Typical causes
- Idle‑timeout expiration of NAT or conntrack entries while the application holds a TCP socket open (e.g., long‑lived DB connections, SSH tunnels, SIP media).
- Asymmetric routing causing the return path to bypass the stateful device, so the forward direction sees state while the reverse does not.
- Misaligned timeout values between endpoints and middleboxes (e.g., application keep‑alive interval longer than NAT TCP timeout).
- Stateful device overload causing premature entry eviction (hash‑table limits, memory pressure).
Impact on Network Performance
- Excess retransmits inflate link utilization; a single flow backs off exponentially, so sustained load comes from many flows that lose state at the same moment and retransmit in step.
- Increased latency for all traffic sharing the same queue due to bufferbloat from retransmit bursts.
- CPU spikes on the stateful device as it performs a failed state lookup for every retransmitted segment and, depending on policy, generates a RST or ICMP error in response.
- Application‑level timeouts and failed transactions, often mistaken for application bugs.
- Potential denial‑of‑service if the storm saturates the uplink or overwhelms the stateful device’s forwarding path.
Understanding Translation State Expiration
Translation State Overview
Translation state is the data structure a middlebox creates to map an internal address:port to an external address:port (NAT) or to track a connection’s lifecycle (conntrack, stateful firewall). For TCP, the entry typically contains:
- 5‑tuple (src IP, dst IP, src port, dst port, protocol)
- Sequence‑number window
- Timestamp of last seen packet
- Timeout value specific to TCP state (e.g., ESTABLISHED, FIN_WAIT)
Expiration Mechanisms and Timers
Most stateful implementations use idle timers that reset on each packet seen in either direction. When the timer expires, the entry is removed and any subsequent packet is treated as new or invalid.
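The idle-timer behaviour described above can be sketched as a toy state table. This is an illustrative model only, not how any particular middlebox is implemented; the class name `IdleTimerTable`, the `see_packet` method, and the three return labels are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    last_seen: float  # timestamp of the last packet in either direction
    timeout: float    # idle timeout in seconds


class IdleTimerTable:
    """Toy state table: an entry expires after `timeout` seconds without traffic."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.entries = {}

    @staticmethod
    def _key(flow):
        # Normalise the 5-tuple so both directions map to the same entry.
        src, dst, sport, dport, proto = flow
        low, high = sorted([(src, sport), (dst, dport)])
        return (low, high, proto)

    def see_packet(self, flow, now: float, syn: bool = False) -> str:
        key = self._key(flow)
        entry = self.entries.get(key)
        # Expire lazily: an entry idle for at least its timeout is removed.
        if entry is not None and now - entry.last_seen >= entry.timeout:
            del self.entries[key]
            entry = None
        if entry is not None:
            entry.last_seen = now  # any packet in either direction resets the timer
            return "tracked"
        if syn:
            self.entries[key] = Entry(last_seen=now, timeout=self.timeout)
            return "new"
        return "out-of-state"  # mid-stream segment with no matching entry
```

Feeding the table a SYN, a reply within the timeout, and then a data segment after a long pause yields `new`, `tracked`, and `out-of-state` in that order, which is the sequence that precedes a storm.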
Linux netfilter/conntrack
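- `/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established` – default 432000 s (5 days), but can be lowered by administrators or container runtimes.
- Separate timers for `SYN_SENT`, `SYN_RECV`, `FIN_WAIT`, `TIME_WAIT`, etc.
- UDP has a much shorter default (`nf_conntrack_udp_timeout` = 30 s).
- When an entry expires, later packets for that tuple no longer match any state. With `nf_conntrack_tcp_loose=0` a mid‑stream segment is classified `INVALID` and is dropped by any rule that rejects that state; with the default `nf_conntrack_tcp_loose=1` conntrack may pick the flow up again, but the original NAT mapping is gone, so the peer can still reject the packet. A new SYN creates a fresh entry.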
Cisco ASA/FTD
- `timeout conn` (default 1 h) for TCP, `timeout udp` (default 2 min).
- `timeout half-closed` for TCP half‑close state.
Palo Alto
- `session timeout` for TCP (default 3600 s) and UDP (default 60 s).
- `session aged-out` logs when idle timeout triggers.
Juniper SRX
- `flow tcp-session` (default 1800 s) and `flow udp-session` (default 60 s).
If no packet matches the tuple before the timer fires, the state is cleared. Subsequent packets from the endpoint that still believes the connection is alive are treated as out‑of‑state and are either dropped or, in the case of NAT, cause address translation failure leading to ICMP unreachable or TCP RST.
Effects of Idle Periods on Translation State
During an idle period, no packets reset the timer. If the idle interval exceeds the configured timeout, the translation/conntrack entry disappears. When the application resumes sending data:
- The first packet may be a data segment (not SYN).
- The stateful device sees an unknown tuple → drops the packet (or sends ICMP port‑unreachable).
- The sender’s TCP stack, missing an ACK, retransmits after the RTO (about 1 s when no RTT sample exists, otherwise derived from the measured RTT) and doubles the timer on each attempt.
- Because the state is still missing, each retransmission meets the same fate, creating a storm until either:
  - The application aborts the connection, or
  - A keep‑alive or new SYN arrives, creating a fresh state entry.
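The back-off sequence above can be worked through numerically. The sketch below assumes Linux-style defaults that are not stated in this document: a 120 s cap on the RTO and 15 retransmissions before the stack gives up (`tcp_retries2`). The function name `retransmit_schedule` is hypothetical.

```python
def retransmit_schedule(initial_rto: float = 1.0,
                        max_rto: float = 120.0,
                        retries: int = 15) -> list:
    """Return the elapsed time (seconds) at which each retransmission fires.

    Assumes the timer doubles after every unanswered attempt and is capped
    at max_rto; both defaults are assumptions modelled on Linux.
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto              # wait one full RTO with no ACK
        times.append(elapsed)       # then retransmit
        rto = min(rto * 2, max_rto) # exponential back-off with a ceiling
    return times


if __name__ == "__main__":
    for attempt, t in enumerate(retransmit_schedule(), start=1):
        print(f"retransmit {attempt:2d} at {t:7.1f} s")
```

With a 1 s initial RTO the first five retransmissions fire at 1, 3, 7, 15 and 31 seconds, so a single flow stays quiet for most of its lifetime; the visible burst comes from many flows following the same schedule at once.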
Identifying Retransmission Storms
Symptoms and Indicators
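- Repeated timeout‑driven retransmissions of the same segment, with no ACKs returning, visible in packet captures (duplicate ACKs and fast retransmits point to ordinary loss instead).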
- Retransmission rate > 10 % of total TCP traffic on a flow (measured via `ss -i` or `netstat -s`).
- ICMP destination unreachable (port unreachable) or ICMP time exceeded spikes from the stateful device.
- CPU utilization on NAT/firewall spikes correlating with bursty traffic.
- Application logs showing “connection reset by peer” or “operation timed out” after periods of inactivity.
- Retransmission storm detection in IDS/IPS (e.g., Snort rule `ET POLICY TCP Retransmission Storm`).
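As a rough host-wide check of the retransmission-rate indicator, the `Tcp:` counters in `/proc/net/snmp` can be compared. This is a sketch under assumptions: it reads the cumulative `RetransSegs` and `OutSegs` counters, which cover every socket on the host since boot rather than a single flow, and the function name `tcp_retransmit_ratio` is hypothetical.

```python
def tcp_retransmit_ratio(snmp_text: str) -> float:
    """Return RetransSegs / OutSegs from the text of /proc/net/snmp."""
    # The file holds a header line and a value line, both prefixed "Tcp:".
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp: header/value pair found")
    header, values = rows[0][1:], rows[1][1:]
    counters = dict(zip(header, (int(v) for v in values)))
    out_segs = counters["OutSegs"]
    return counters["RetransSegs"] / out_segs if out_segs else 0.0


if __name__ == "__main__":
    with open("/proc/net/snmp") as fh:
        print(f"retransmit ratio since boot: {tcp_retransmit_ratio(fh.read()):.2%}")
```

Because the counters are cumulative, sampling twice and dividing the deltas gives a more useful figure during an incident than the since-boot ratio.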
Diagnostic Tools and Techniques
| Tool | Usage | What to Look For |
|---|---|---|
| `tcpdump -i any -nn -s0 -w /tmp/storm.pcap 'tcp[tcpflags] & (tcp-syn\|tcp-ack\|tcp-rst) != 0'` | Capture TCP segments with SYN, ACK, or RST set | The same sequence number resent after an idle gap; RSTs arriving from the middlebox. |
| `ss -ti state established '( dport = :22 or sport = :22 )'` | Show per‑socket TCP info | `retrans` counter rising rapidly. |
| `conntrack -L -p tcp --dport 22` | List conntrack entries for SSH | Entries disappearing after idle period; timeout field near zero. |
| `iptables -L -v -n -t nat` | View NAT counters | pkts increasing on MASQUERADE but bytes low due to drops. |
| `nft list ruleset` | nftables equivalent | Same as above. |
| `tcpick -C -yP -r /tmp/storm.pcap` | Re‑assemble streams | Application data missing after idle gap. |
| `ethtool -S eth0` | NIC stats | Rising drop counters (names vary by driver); NICs do not count TCP retransmissions. |
| Prometheus `node_exporter` (netstat collector) alerts | Long‑term monitoring | Alert on `rate(node_netstat_Tcp_RetransSegs[5m])` > threshold. |
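To watch the per-socket retransmission counter from the `ss -ti` row without reading the output by eye, the text can be parsed. This is a sketch that assumes the `retrans:<in-flight>/<total>` token printed by iproute2 builds I am familiar with; the exact layout can differ between versions, and the function name `retrans_by_socket` is hypothetical.

```python
import re

# Assumed token format: retrans:<currently in flight>/<total since connect>
RETRANS = re.compile(r"\bretrans:(\d+)/(\d+)")
HEADER_WORDS = {"State", "Recv-Q", "Netid"}


def retrans_by_socket(ss_output: str) -> dict:
    """Map (local, peer) address pairs to their total retransmission count."""
    result = {}
    current = None
    for line in ss_output.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented lines carry the TCP info for the preceding socket line.
            match = RETRANS.search(line)
            if current is not None and match:
                result[current] = int(match.group(2))
            continue
        fields = line.split()
        if fields[0] in HEADER_WORDS or len(fields) < 2:
            current = None  # column header or unexpected line
            continue
        current = (fields[-2], fields[-1])  # local and peer address:port
        result[current] = 0                 # token is omitted when no retransmits
    return result
```

Running it on two snapshots a few seconds apart and comparing the totals shows which sockets are actively retransmitting.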
Log Analysis and Error Messages
- Linux kernel (`dmesg` or `/var/log/kern.log`):
  `nf_ct_ftp: dropping packet proto=TCP src=10.0.0.5 dst=203.0.113.10 sport=54321 dport=22 state=INVALID`
  Indicates conntrack saw a packet for a non‑existent entry (invalid packets are logged only when `net.netfilter.nf_conntrack_log_invalid` is set or a LOG rule matches them).
- Cisco ASA:
  `%ASA-6-106015: Deny TCP (no connection) from 10.0.0.5/54321 to 203.0.113.10/22 flags ACK on interface outside`
  Shows ACK received without existing connection.
- Palo Alto:
  `session end reason: aged-out` followed by `session end reason: retransmission timeout`
  Correlates idle timeout with subsequent retransmits.
- Juniper SRX:
  `flow_session_timeout: TCP session timed out`
  Look for bursts of these messages coinciding with retransmission spikes.
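Bursts of the ASA "no connection" message can be counted per flow with a small parser. This is a sketch, not a vendor tool: the regular expression is built from the example message in this section, tolerates either a space or `=` after `flags`, and accepts any severity digit; the function name `count_no_connection` is hypothetical.

```python
import re
from collections import Counter

NO_CONN = re.compile(
    r"%ASA-\d-106015: Deny TCP \(no connection\) from "
    r"(?P<src>\S+)/(?P<sport>\d+) to (?P<dst>\S+)/(?P<dport>\d+) "
    r"flags[ =](?P<flags>.+?)\s+on interface (?P<ifc>\S+)"
)


def count_no_connection(lines) -> Counter:
    """Count 106015 denials per (src, sport, dst, dport) flow."""
    hits = Counter()
    for line in lines:
        match = NO_CONN.search(line)
        if match:
            flow = (match["src"], int(match["sport"]),
                    match["dst"], int(match["dport"]))
            hits[flow] += 1
    return hits
```

A flow that appears several times within a minute or two, with growing gaps between entries, matches the retransmission pattern of an endpoint whose state was aged out.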
Troubleshooting Retransmission Storms
Step‑by‑Step Troubleshooting Process
- Confirm the symptom – Capture traffic on both sides of the stateful device; verify retransmits occur only after an idle gap.
- Locate the stateful device – Identify where NAT/conntrack/firewall sits (traceroute, `ip route get`, or ACL logs).
- Check timeout values – Retrieve the relevant idle timers (see Configuring Translation State Timeout Values). Compare with observed idle period.
- Correlate logs – Match timestamp of state expiration log with first retransmit.
- Validate symmetry – Ensure forward and reverse paths traverse the same stateful node (check for asymmetric routing, ECMP, or policy‑based routing).
- Test with a keep‑alive – Send a minimal packet (e.g., a TCP keep‑alive probe or application‑level keep‑alive) at an interval shorter than the timeout; observe if storm disappears.
- Adjust or workaround – Increase timeout, enable TCP keep‑alives on hosts, or exempt specific flows from tracking (e.g., `iptables -t raw -I PREROUTING -p tcp --dport 22 -j CT --notrack`, with a matching rule for the return direction; untracked flows cannot be NATed, so this suits filtering‑only paths).
- Verify – Repeat capture; retransmits should drop to baseline (< 1 %).
- Document – Record original and new timeout values, reason for change, and any side effects.
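The first step, confirming that retransmits follow an idle gap, can be automated once the capture is reduced to timestamps and sequence numbers. The sketch below assumes the input is one direction of one flow, already sorted by time; extracting it from a pcap is left to whichever capture tool is in use, and the function name `retransmits_after_idle` is hypothetical.

```python
def retransmits_after_idle(packets, idle_threshold: float) -> list:
    """Find segments that were retransmitted and first sent after a long pause.

    packets: iterable of (timestamp_seconds, sequence_number), time-ordered.
    Returns a list of (seq, idle_gap_before_first_send, retransmit_count).
    """
    first_gap = {}  # seq -> silence preceding its first transmission
    retries = {}    # seq -> how many times it was seen again
    prev_time = None
    for ts, seq in packets:
        if seq in first_gap:
            retries[seq] = retries.get(seq, 0) + 1  # same seq again: retransmit
        else:
            first_gap[seq] = 0.0 if prev_time is None else ts - prev_time
        prev_time = ts
    return [(seq, first_gap[seq], count)
            for seq, count in retries.items()
            if first_gap[seq] >= idle_threshold]
```

Setting `idle_threshold` just below the middlebox timeout found in the "Check timeout values" step separates state-expiry retransmits from ordinary loss: the former all follow a gap at least as long as the timer.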
Common Causes and Solutions
| Cause | Symptom | Fix |
|---|---|---|
| NAT TCP timeout too short (e.g., 30 s) | Storm after ~30 s idle | Increase /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established to match application keep‑alive or raise to ≥ 1 h. |
| Application lacks keep‑alive | Storm after any idle > timeout | Enable TCP keep‑alive on the socket (`SO_KEEPALIVE`) and tune the defaults (net.ipv4.tcp_keepalive_time=60, net.ipv4.tcp_keepalive_intvl=10, net.ipv4.tcp_keepalive_probes=6), or use application‑level keep‑alive. |
| Asymmetric routing causing state loss on return path | Only outbound retransmits, inbound ACKs arrive | Symmetrize routing (static routes, policy‑based routing, or disable ECMP for affected flows). |
| Stateful device overload dropping entries early | Storm under high connection count | Raise the table limit (net.netfilter.nf_conntrack_max) and the hash size (nf_conntrack `hashsize` module parameter), or upgrade hardware; raise nf_conntrack_expect_max if helpers are in use. |
| Mis‑matched UDP timeout (e.g., 5 s) for media streams | Storm on RTP silence periods | Raise UDP timeout (nf_conntrack_udp_timeout) or enable udp timeout never on firewall for media ports. |
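The keep‑alive fix in the table can also be applied per socket instead of host-wide. The sketch below uses the same 60/10/6 values as the sysctl example; the function name `enable_keepalive` is hypothetical, and the `TCP_KEEP*` options are set only where the platform's `socket` module exposes them.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, probes: int = 6) -> None:
    """Turn on TCP keep-alive and, where supported, override the timers."""
    # Without SO_KEEPALIVE the kernel never sends probes, whatever the sysctls say.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer overrides are platform-specific; skip missing ones.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keep-alive enabled:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Choosing an idle value well below the shortest middlebox timeout on the path keeps the state entry refreshed without touching the device configuration.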
Advanced Troubleshooting Techniques
- eBPF tracing – Use `bpftrace` to watch conntrack entries being removed. Probe names depend on the kernel build (list them with `bpftrace -l 'kprobe:nf_ct_*'`); assuming `nf_ct_delete` is available: `bpftrace -e 'kprobe:nf_ct_delete { printf("%s removed a conntrack entry\n", comm); }'`
- TCP_INFO socket option – From inside the owning process (its sockets are listed under `/proc/<pid>/fd/`), call `getsockopt(TCP_INFO)` to see if `tcpi_retransmits` climbs while `tcpi_state` stays `TCP_ESTABLISHED`.
- Packet generator – Use `hping3` or `nemesis` to simulate idle periods and verify storm threshold: `hping3 -S -p 22 -i u1000000 203.0.113.10` (one SYN per second, no data).
- Conntrack expectations – For FTP/SIP, ensure helper expectations are not timing out prematurely (`nf_conntrack_expect_max`).
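The TCP_INFO check can be done from Python on Linux. This sketch reads only the first three bytes of the kernel's `struct tcp_info` (`tcpi_state`, `tcpi_ca_state`, `tcpi_retransmits`); the helper names and the 104-byte buffer size are assumptions for the example, and `socket.TCP_INFO` is not available on every platform.

```python
import socket
import struct

TCP_ESTABLISHED = 1  # value of TCP_ESTABLISHED in the kernel's TCP state enum


def parse_tcp_info(raw: bytes) -> tuple:
    """Return (tcpi_state, tcpi_retransmits) from a raw struct tcp_info buffer."""
    # Layout starts with three u8 fields: state, ca_state, retransmits.
    state, _ca_state, retransmits = struct.unpack_from("BBB", raw)
    return state, retransmits


def read_tcp_info(sock: socket.socket) -> tuple:
    """Query the kernel for a connected socket's TCP_INFO (Linux only)."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    return parse_tcp_info(raw)


def looks_stalled(sock: socket.socket) -> bool:
    """True when the socket is still ESTABLISHED but has unanswered retransmits."""
    state, retransmits = read_tcp_info(sock)
    return state == TCP_ESTABLISHED and retransmits > 0
```

Polling `looks_stalled` from the application's own health check gives an early signal that the path has lost state, well before the connection finally times out.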
Configuring Translation State Timeout Values
Overview of Timeout Values and Settings
Timeout values dictate how long a state entry survives without seeing a packet. They are protocol‑ and state‑specific. Adjusting them prevents premature expiration while balancing memory usage.
Key knobs (Linux)
- `net.netfilter.nf_conntrack_tcp_timeout_established` – ESTABLISHED state.
- `net.netfilter.nf_conntrack_tcp_timeout_time_wait` – TIME_WAIT.
- `net.netfilter.nf_conntrack_udp_timeout` – UDP.
- `net.netfilter.nf_conntrack_udp_timeout_stream` – UDP seen as a stream (e.g., SIP).
- `net.netfilter.nf_conntrack_generic_timeout` – fallback for unknown protocols.
On firewalls, similar timers exist under timeout commands (ASA, PAN‑OS, SRX).
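When a flow crosses several stateful devices, the shortest idle timer on the path is the one that matters. The sketch below picks it and derives a keep‑alive interval from it; the device labels, the function name `recommend_keepalive`, and the 50 % safety margin are all assumptions chosen for illustration.

```python
def recommend_keepalive(path_timeouts: dict, margin: float = 0.5) -> tuple:
    """Return (limiting_device, keepalive_seconds) for a path.

    path_timeouts maps a device label to its idle timeout in seconds.
    The keep-alive interval is the shortest timeout scaled by `margin`,
    so a single lost probe still leaves time for the next one.
    """
    if not path_timeouts:
        raise ValueError("path_timeouts must not be empty")
    device, shortest = min(path_timeouts.items(), key=lambda item: item[1])
    return device, max(1, int(shortest * margin))


if __name__ == "__main__":
    # Labels are placeholders; the values are the defaults quoted in this document.
    path = {"linux-conntrack": 432000, "asa": 3600, "srx": 1800}
    print(recommend_keepalive(path))
```

Raising a timeout on one device helps only if it was the limiting one, which is why the function reports the device alongside the interval.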
CLI Examples for Configuring Timeout Values
Linux (sysctl)
# View current TCP established timeout (seconds)
sysctl net.netfilter.nf_conntrack_tcp_timeout_established
# Set to 2 hours (7200s) – persists until reboot
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=7200
# Make permanent
echo "net.netfilter.nf_conntrack_tcp_timeout_established=7200" >> /etc/sysctl.d/99-conntrack.conf
sysctl -p /etc/sysctl.d/99-conntrack.conf
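To confirm from a script that the new value took effect, the sysctl can be read back through `/proc/sys`. This is a minimal sketch: the helper names are hypothetical, the dotted-name-to-path mapping assumes no component contains a dot, and the key only exists while the conntrack module is loaded.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root, *name.split("."))


def read_sysctl_int(name: str, root: str = "/proc/sys"):
    """Return the integer value of a sysctl, or None if the key is absent."""
    try:
        return int(sysctl_path(name, root).read_text().strip())
    except FileNotFoundError:
        return None  # e.g. nf_conntrack module not loaded


if __name__ == "__main__":
    key = "net.netfilter.nf_conntrack_tcp_timeout_established"
    print(key, "=", read_sysctl_int(key))
```

The `root` parameter exists so the helper can be pointed at a container's or test fixture's copy of the tree rather than the live host.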
Linux (nftables)
# nftables sets per-flow timeouts through a named 'ct timeout' object (kernel 5.6+)
nft add table ip filter
# 'ssh_long' is a placeholder object name
nft add ct timeout ip filter ssh_long '{ protocol tcp; l3proto ip; policy = { established: 7200 }; }'
nft add chain ip filter input '{ type filter hook input priority 0; }'
# Example: apply the 7200s established timeout to packets matching port 22
nft add rule ip filter input tcp dport 22 ct timeout set '"ssh_long"'
Cisco ASA
# Show current timeout
show running-config all | include timeout
# Change TCP timeout to 2 hours
timeout conn 2:00:00
Palo Alto (PAN‑OS)
# Configure via CLI
configure
set deviceconfig setting session timeout-tcp 7200
commit
Juniper SRX
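# Junos sets the idle timeout per application; 'custom-ssh' is a placeholder name
# Set the TCP inactivity timeout to 2 hours, then reference the application in the security policy
set applications application custom-ssh protocol tcp destination-port 22 inactivity-timeout 7200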
commit