
One packet from netns process to host socket

Introduction to Network Namespaces and Veth Pairs

Network namespaces are a Linux kernel feature that allows for the creation of isolated network environments. Each namespace has its own set of network devices, routing tables, and network stacks, which are independent of the host system’s network configuration. This isolation enables the creation of multiple, independent network environments on a single host, making it ideal for scenarios such as containerization, virtualization, and network testing.

Overview of Network Namespaces

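A process can move itself into a new network namespace with the unshare command and its --net option, or with the unshare(2) system call and the CLONE_NEWNET flag. Named namespaces, which the examples in this post use, are created with ip netns add. That command bind-mounts the namespace under /var/run/netns so that ip netns exec and ip link set can refer to it by name. A namespace created with unshare alone has no such name and does not appear in ip netns list.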
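One way to confirm that two processes really sit in different network namespaces is to compare the namespace links under /proc. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names netns_id and same_netns are hypothetical and not part of any library.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the namespace identity of `pid` (for example "net:[4026531840]")
 * into buf. Returns 0 on success, -1 on error. */
int netns_id(pid_t pid, char *buf, size_t len)
{
    char path[64];
    ssize_t n;

    snprintf(path, sizeof(path), "/proc/%ld/ns/net", (long)pid);
    n = readlink(path, buf, len - 1);   /* readlink does not NUL-terminate */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* Return 1 if both processes share a network namespace, 0 if they do not,
 * and -1 if either link cannot be read (often a permission problem). */
int same_netns(pid_t a, pid_t b)
{
    char ida[64], idb[64];

    if (netns_id(a, ida, sizeof(ida)) < 0 || netns_id(b, idb, sizeof(idb)) < 0)
        return -1;
    return strcmp(ida, idb) == 0;
}
```

Reading the link of another user's process usually requires root, so in practice this is run as root or against the caller's own processes.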

Veth Pair Configuration and Setup

A veth pair is a virtual Ethernet device made of two connected interfaces: a frame transmitted on one end is received on the other. The two ends can live in any namespaces. In this post one end stays in the host namespace and the other is moved into the network namespace, which gives the namespace a link to the host. To set this up, create the namespace, create the pair with ip link add, and move one end with ip link set. For example:

ip netns add myns
ip link add veth0 type veth peer name veth1
ip link set veth1 netns myns

This creates a veth pair with veth0 in the host namespace and veth1 in the myns network namespace.

Sendmsg System Call and SKB Creation

The sendmsg system call sends a message on a socket. It takes three arguments: the socket file descriptor, a pointer to a struct msghdr, and a set of flags. The msghdr describes the message: an optional destination address, an array of struct iovec buffers for scatter-gather I/O, and optional ancillary data. When sendmsg is called, the protocol layer copies the user data into one or more socket buffers (skbs), which then travel down the network stack through the transport, IP, and device layers.
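To make the argument layout concrete, here is a small sketch that sends two separate buffers as one message. The helper name send_two_parts is hypothetical; the structures and the sendmsg call are the standard POSIX ones.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send two NUL-terminated strings as a single message using
 * scatter-gather I/O. Returns the byte count sent, or -1 on error. */
ssize_t send_two_parts(int fd, const char *a, const char *b)
{
    struct iovec iov[2];
    struct msghdr msg;

    iov[0].iov_base = (void *)a;
    iov[0].iov_len  = strlen(a);
    iov[1].iov_base = (void *)b;
    iov[1].iov_len  = strlen(b);

    memset(&msg, 0, sizeof(msg));   /* no address, no ancillary data */
    msg.msg_iov    = iov;
    msg.msg_iovlen = 2;

    return sendmsg(fd, &msg, 0);
}
```

On a connected datagram socket the receiver sees one datagram containing both parts back to back; the kernel gathers the buffers while copying them into the skb.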

SKB Allocation and Initialization

An skb is allocated with alloc_skb or one of its wrappers. The similarly named skb_init is not called per packet; it only sets up the skb slab caches at boot. After allocation, skb_reserve leaves headroom for the headers, skb_put appends payload at the tail, and each layer prepends its header with skb_push on the way down: TCP or UDP first, then IP, then Ethernet. skb_pull is the mirror image and strips headers on the receive path.
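The pointer arithmetic behind these helpers is easier to follow in a userspace model. The sketch below is not kernel code: struct toy_skb and the toy_ functions are hypothetical stand-ins that mimic what skb_reserve, skb_put, skb_push, and skb_pull do to the data and tail offsets, and they omit the bounds checks the kernel performs.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model of an skb's linear buffer: 0 <= data <= tail <= 256. */
struct toy_skb {
    unsigned char buf[256];
    size_t data;   /* offset of the first valid byte */
    size_t tail;   /* offset one past the last valid byte */
};

/* Leave `len` bytes of headroom in an empty buffer for later headers. */
void toy_reserve(struct toy_skb *skb, size_t len)
{
    skb->data = skb->tail = len;
}

/* Append `len` bytes at the tail; returns where the caller writes them. */
unsigned char *toy_put(struct toy_skb *skb, size_t len)
{
    unsigned char *p = skb->buf + skb->tail;
    skb->tail += len;
    return p;
}

/* Prepend a `len`-byte header by moving data toward the buffer start. */
unsigned char *toy_push(struct toy_skb *skb, size_t len)
{
    skb->data -= len;
    return skb->buf + skb->data;
}

/* Strip a `len`-byte header from the front, as the receive path does. */
unsigned char *toy_pull(struct toy_skb *skb, size_t len)
{
    skb->data += len;
    return skb->buf + skb->data;
}

/* Number of valid bytes currently in the buffer. */
size_t toy_len(const struct toy_skb *skb)
{
    return skb->tail - skb->data;
}
```

Reserving 54 bytes (14 for Ethernet, 20 for IP, 20 for TCP without options), putting the payload, and then pushing each header in turn leaves data at offset 0 with no copying of the payload.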

Packet Transmission Across Veth Pair

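When the skb reaches the device layer, the kernel calls the transmit function of the veth driver (veth_xmit in drivers/net/veth.c). No wire is involved: the driver looks up the peer device, clears the skb state that must not leak across namespaces, and queues the same skb on the receive path of the peer. From the host's point of view the packet then arrives on veth0 much like a frame received from a physical NIC, and it enters the host network stack in softirq context.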

Packet Reception in the Host Stack

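On the host, the receive path (netif_receive_skb, then ip_rcv for IPv4) validates the IP header, runs the Netfilter PREROUTING hook, and performs a route lookup. Because the destination address is local to the host, the packet is handed to ip_local_deliver, passes the INPUT hook, and reaches the transport protocol (tcp_v4_rcv or udp_rcv), which verifies the checksum if needed and finds the destination socket. The data is queued on that socket, and the application reads it with read, recv, or recvmsg.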

Routing Decision Making

On receive, the kernel performs a routing lookup to decide whether the packet is addressed to the local host, must be forwarded, or must be dropped. The routing table (the FIB) maps destination prefixes to next hops and output interfaces, and the longest matching prefix wins. The input path uses ip_route_input_noref for this lookup. ip_route_output and its variants serve the output path, that is, packets generated by local sockets such as the sender inside the namespace.
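The lookup rule itself, longest prefix match, can be sketched in a few lines. This is a linear scan for illustration only; the kernel uses a trie, and struct toy_route and toy_lookup are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_route {
    uint32_t    prefix;      /* network address, host byte order */
    int         prefix_len;  /* 0..32 */
    const char *dev;         /* output interface name */
};

/* Build the netmask for a prefix length; a /0 mask is all zeroes. */
static uint32_t mask_for(int len)
{
    return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Return the most specific route that matches dst, or NULL if none does. */
const struct toy_route *toy_lookup(const struct toy_route *tbl, size_t n,
                                   uint32_t dst)
{
    const struct toy_route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if ((dst & mask_for(tbl[i].prefix_len)) != tbl[i].prefix)
            continue;                       /* prefix does not match */
        if (!best || tbl[i].prefix_len > best->prefix_len)
            best = &tbl[i];                 /* longer prefix wins */
    }
    return best;
}
```

With a default route on eth0 and 10.0.0.0/24 on veth0, a lookup for 10.0.0.1 selects the veth0 entry because /24 is more specific than /0.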

Routing Cache and Fast Path

Kernels before 3.6 kept an IPv4 routing cache that mapped recently used destinations to routing entries. That cache was removed in Linux 3.6. Current kernels look up every destination in the FIB, which is stored as a compressed trie and is fast enough not to need a separate cache. What remains cached is the result of a lookup: a connected socket keeps a reference to its route (a dst entry), and next hops cache dst entries as well, so an established TCP connection normally avoids a full lookup for each packet it sends.

Checksum Handling and Verification

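When a packet is transmitted, the TCP or UDP checksum covers a pseudo-header (source address, destination address, protocol, and length) plus the transport header and payload. The csum_tcpudp_magic function folds the pseudo-header fields into a partial sum computed over the data; its arguments are the addresses, the length, the protocol, and that partial sum, not the data buffer itself. With checksum offload enabled, which is the default on veth, the kernel fills in only the pseudo-header sum and marks the skb CHECKSUM_PARTIAL, leaving the rest to the device. A veth device never completes the checksum, because the packet never touches a wire.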

Checksum Verification and Validation

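When a packet is received, the transport layer checks skb->ip_summed. If the device has already vouched for the checksum (CHECKSUM_UNNECESSARY), or the packet arrived over veth still marked CHECKSUM_PARTIAL and therefore never left the host, software verification is skipped. Otherwise the kernel recomputes the sum over the pseudo-header and the segment, including the stored checksum field; a result of zero after folding means the packet is valid. A packet that fails the check is dropped and counted in the InCsumErrors counters in /proc/net/snmp. One practical consequence: tcpdump on a veth interface often reports bad checksums for locally generated traffic, which is an artifact of offload and not corruption.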
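The arithmetic can be reproduced in user space. The sketch below implements the ones' complement Internet checksum over the IPv4 pseudo-header and a TCP or UDP segment. It is a portable illustration, not the kernel's optimized code, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running ones' complement sum, 16 bits at a time. */
static uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)p[0] << 8 | p[1];
        p += 2;
        len -= 2;
    }
    if (len)                        /* odd trailing byte is zero-padded */
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold the carries back in and return the complemented 16-bit result. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Checksum over the IPv4 pseudo-header plus the whole segment.
 * To compute: call with the segment's checksum field set to zero.
 * To verify: call with the field as received; 0 means valid. */
uint16_t l4_checksum(const uint8_t src[4], const uint8_t dst[4],
                     uint8_t proto, const uint8_t *seg, uint16_t seg_len)
{
    uint8_t pseudo[4] = { 0, proto, (uint8_t)(seg_len >> 8), (uint8_t)seg_len };
    uint32_t sum = 0;

    sum = csum_add(sum, src, 4);
    sum = csum_add(sum, dst, 4);
    sum = csum_add(sum, pseudo, 4);
    sum = csum_add(sum, seg, seg_len);
    return csum_fold(sum);
}
```

The same function serves both directions because adding a value to its own ones' complement always yields 0xffff, which complements to zero.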

Netfilter Decision Making and Packet Filtering

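Netfilter is the kernel framework for packet filtering, address translation, and packet mangling; iptables and nftables are its front ends. It exposes five hooks in the IPv4 path: PREROUTING, INPUT, FORWARD, OUTPUT, and POSTROUTING. Each network namespace has its own rule set. A packet sent from the namespace to a host socket therefore traverses OUTPUT and POSTROUTING inside the namespace, then PREROUTING and INPUT on the host. At each hook the registered rules return a verdict such as accept or drop.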
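Which hooks a packet meets depends only on the path it takes through one namespace. The table below is written as a small C function to make the three cases explicit; the enum and function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

enum toy_hook {
    HOOK_PREROUTING, HOOK_INPUT, HOOK_FORWARD, HOOK_OUTPUT, HOOK_POSTROUTING
};

enum toy_path {
    PATH_LOCAL_OUT,   /* sent by a local socket */
    PATH_LOCAL_IN,    /* delivered to a local socket */
    PATH_FORWARDED    /* routed through to another interface */
};

/* Fill `out` with the hooks traversed, in order; returns how many. */
size_t hooks_for(enum toy_path path, enum toy_hook out[3])
{
    switch (path) {
    case PATH_LOCAL_OUT:
        out[0] = HOOK_OUTPUT;
        out[1] = HOOK_POSTROUTING;
        return 2;
    case PATH_LOCAL_IN:
        out[0] = HOOK_PREROUTING;
        out[1] = HOOK_INPUT;
        return 2;
    case PATH_FORWARDED:
        out[0] = HOOK_PREROUTING;
        out[1] = HOOK_FORWARD;
        out[2] = HOOK_POSTROUTING;
        return 3;
    }
    return 0;
}
```

For the packet followed in this post, the namespace side is the local-out case and the host side is the local-in case, each evaluated against that namespace's own rules.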

NFQUEUE and Packet Filtering

NFQUEUE is a Netfilter target that hands packets to a user-space program for a verdict. A rule with -j NFQUEUE --queue-num N places matching packets on queue N. A program bound to that queue, typically through libnetfilter_queue, receives each packet, may modify it, and returns a verdict such as accept or drop. If no program is listening on the queue, packets are dropped unless the rule uses --queue-bypass.

Troubleshooting Packet Loss and Corruption

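tcpdump and Wireshark capture and decode network packets. Both capture at the device level, so the useful technique on this path is to capture on both ends of the veth pair (veth1 inside the namespace through ip netns exec, veth0 on the host) and compare. A packet that is visible on veth0 but never reaches the socket was lost in the host stack; common causes are a Netfilter rule, a failed route lookup, or reverse-path filtering.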
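Interface counters complement a capture, since a drop counter that increments on one end of the pair narrows down where the loss happens. The sketch below parses one interface line in the /proc/net/dev format; the struct and function names are hypothetical, and the field positions assume the standard layout of eight receive fields followed by eight transmit fields.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The four counters this sketch extracts. */
struct if_counters {
    unsigned long long rx_packets, rx_drop, tx_packets, tx_drop;
};

/* Parse a line such as "  veth0: 1000 10 0 2 0 0 0 0 2000 20 0 1 0 0 0 0".
 * Returns 0 on success, -1 if the line is malformed or names another
 * interface. */
int parse_netdev_line(const char *line, const char *ifname,
                      struct if_counters *c)
{
    char name[32];
    unsigned long long v[16];
    const char *p = strchr(line, ':');
    char *end;

    if (!p || sscanf(line, " %31[^:]", name) != 1 || strcmp(name, ifname) != 0)
        return -1;

    p++;                                /* skip the colon */
    for (int i = 0; i < 16; i++) {
        v[i] = strtoull(p, &end, 10);
        if (end == p)
            return -1;                  /* fewer than 16 numeric fields */
        p = end;
    }

    c->rx_packets = v[1];               /* receive: bytes, packets, errs, drop */
    c->rx_drop    = v[3];
    c->tx_packets = v[9];               /* transmit fields start at index 8 */
    c->tx_drop    = v[11];
    return 0;
}
```

A caller would read /proc/net/dev line by line (inside the namespace for veth1, on the host for veth0) and pass each line to this function until it returns 0.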

Debugging Netfilter and Routing Issues

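Several kernel facilities help when a capture shows the packet arriving but not being delivered. The ip route get command shows the routing decision the kernel would make for a given destination. Setting the sysctl net.ipv4.conf.all.log_martians to 1 logs packets dropped for having an impossible source address, which includes reverse-path filter drops. For Netfilter, the per-rule packet counters (iptables -L -v -n) show which rule matched, and the TRACE target in the raw table logs every rule a packet traverses. tcpdump on each end of the veth pair then confirms whether the packet left the namespace and arrived on the host.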

Code Examples and CLI Commands

Using Socket API for Sendmsg

The following code example demonstrates how to use the socket API to send a message over a socket:

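#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void) {
    int sockfd;
    struct sockaddr_in servaddr;

    // Create a TCP socket
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("socket");
        return 1;
    }

    // Set up the server address. 127.0.0.1 stays inside the caller's own
    // namespace; use the host's veth0 address to cross the veth pair.
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(8080);
    inet_pton(AF_INET, "127.0.0.1", &servaddr.sin_addr);

    // Connect to the server
    if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0) {
        perror("connect");
        close(sockfd);
        return 1;
    }

    // Describe the message: sendmsg takes a struct msghdr, not a raw buffer
    char *msg = "Hello, world!";
    struct iovec iov = { .iov_base = msg, .iov_len = strlen(msg) };
    struct msghdr hdr = { .msg_iov = &iov, .msg_iovlen = 1 };

    // Send the message
    if (sendmsg(sockfd, &hdr, 0) < 0)
        perror("sendmsg");

    close(sockfd);
    return 0;
}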

Configuring Veth Pair and Network Namespace

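The following CLI commands create the namespace, configure the veth pair, and bring both ends up. The host-side address 10.0.0.2 is an example value chosen for this post; any unused address in the same subnet works:

ip netns add myns
ip link add veth0 type veth peer name veth1
ip link set veth1 netns myns
# 10.0.0.2 is an example host-side address
ip addr add 10.0.0.2/24 dev veth0
ip netns exec myns ip addr add 10.0.0.1/24 dev veth1
ip link set veth0 up
ip netns exec myns ip link set veth1 up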

Scaling Limitations and Performance Considerations

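Each namespace and veth pair is cheap on its own, but a host that runs hundreds or thousands of them pays in other ways: more interfaces to enumerate, more rule sets to maintain, and an extra pass through the stack for every packet that crosses a veth pair. Batching helps here. With segmentation offload enabled, a large TCP send can cross the pair as a single oversized skb instead of many MTU-sized packets. Interrupt coalescing is a hardware NIC feature and has no effect on veth itself.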

Optimizing Netfilter and Routing Performance

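Netfilter cost grows with the number of rules a packet must traverse. The usual optimizations are to accept established connections early with a conntrack state match, to replace long rule lists with sets (ipset or nftables sets), and to keep the rule set in each namespace small. NFQUEUE is not a performance feature: it copies every queued packet to user space and waits for a verdict, so it should be limited to the traffic that needs it. On the routing side, connected sockets already benefit from the cached route they hold.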

Advanced Topics and Edge Cases

Handling VLANs and MPLS in Veth Pairs

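VLANs and MPLS can be layered on top of a veth pair. For VLANs, the 8021q module provides VLAN sub-interfaces: a device created with ip link add link veth0 name veth0.100 type vlan id 100 tags the frames it sends with VLAN ID 100 and receives frames that carry that tag. For MPLS, the mpls_router module handles label switching and mpls_iptunnel adds label encapsulation to IP routes. MPLS input must also be enabled per interface through the net.mpls.conf.<interface>.input sysctl.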

Netfilter and Routing in Complex Network Topologies

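Each network namespace has its own routing tables, policy rules, and Netfilter rule set, so in topologies with many namespaces, bridges, or tunnels a packet may be filtered and routed several times before it reaches its socket. Policy routing (ip rule) can select a routing table based on source address, incoming interface, or firewall mark. In such setups it is usually easier to debug one namespace at a time and follow the packet hop by hop.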

CLI Examples for Troubleshooting and Verification

Using Ip and Tcpdump Commands for Verification

The following CLI commands demonstrate how to use the ip and tcpdump commands to verify packet transmission and reception:

ip link show veth0
tcpdump -i veth0 -n -vv -s 0 -c 100

Debugging Network Issues with Sysctl and Procfs

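The following CLI commands show how to set a kernel network parameter with sysctl and read it back through procfs. The example uses net.ipv4.ip_forward, which controls whether the host forwards packets between interfaces. Forwarding is not needed to deliver a packet to a socket on the host itself, but it is required if traffic from the namespace must be routed beyond the host: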

sysctl -w net.ipv4.ip_forward=1
cat /proc/sys/net/ipv4/ip_forward

Netfilter and Routing Configuration Examples

Configuring Netfilter Rules and Chains

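The following CLI commands append rules that accept TCP traffic to and from port 8080, the port used by the socket example above. Because -A appends to the end of a chain, an earlier rule that drops the same traffic still takes precedence: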

iptables -A INPUT -p tcp --dport 8080 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 8080 -j ACCEPT

Setting Up Routing Tables and Rules

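The following CLI commands add on-link routes on the host that point at the veth device. A route of the form via 10.0.0.1 only works if 10.0.0.1 is already reachable through another route, so routes toward a directly attached peer name the device and omit the gateway. If veth0 already has an address in 10.0.0.0/24, the kernel installs the /24 route automatically and the first command is redundant:

ip route add 10.0.0.0/24 dev veth0
ip route add 10.0.0.1/32 dev veth0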
