LinkState

TextFSM, parsers, and LLMs on the same CLI mess

Introduction to Template-Based and LLM-Assisted Extraction

Extracting structured data from multi-vendor show interface output is a routine task in network operations, and the tooling built on top of it depends on the result being accurate and repeatable. Output formats differ between vendors, platforms, and software releases, so an extractor written for one format often fails on another. This article compares three approaches to the problem: deterministic templates, parser libraries, and LLM-assisted extraction.

Overview of Approaches

Deterministic Templates

Deterministic templates are predefined patterns, such as regular expressions or TextFSM templates, that extract specific fields from show interface output. Each template is written against the exact layout of one output format, so extraction is precise when the format matches and fails when it does not. Templates are widely used because they are simple and effective on well-structured output.

Parser Libraries

Parser libraries are software components that provide a structured way to parse and extract data from show interface outputs. These libraries often include a set of predefined parsing rules and can be customized to handle specific output formats. Parser libraries offer a more flexible approach than deterministic templates, as they can adapt to variations in output formats.

LLM-Assisted Extraction

LLM-assisted extraction utilizes Large Language Models (LLMs) to extract data from show interface outputs. LLMs are trained on vast amounts of text data and can learn to identify patterns and relationships within the data. This approach offers a high degree of flexibility and can handle complex, unstructured, or variable output formats. However, LLM-assisted extraction requires significant computational resources and may introduce additional complexity.

Deterministic Templates

Deterministic templates are a straightforward approach to extracting data from show interface output. They are simple, fast, and accurate on the formats they were written for, but they stop matching as soon as the output deviates from the expected layout.

Advantages and Disadvantages

Example Use Cases and Code

Deterministic templates are ideal for extracting data from devices with well-documented and consistent output formats. They are commonly used in network monitoring tools and scripts that require fast and accurate data extraction.

import re

# Example output in the style of Cisco IOS "show ip interface brief"
output = """Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 10.10.10.1 YES NVRAM up up
GigabitEthernet2 10.10.10.2 YES NVRAM down down"""

# Define a deterministic template as a regular expression
template = r"GigabitEthernet(\d+)\s+([0-9\.]+)\s+YES\s+NVRAM\s+(up|down)\s+(up|down)"

# Extract data using the template
matches = re.findall(template, output)

# Print the extracted data
for match in matches:
    print(f"Interface: GigabitEthernet{match[0]}, IP: {match[1]}, Status: {match[2]} {match[3]}")
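
One limitation of the pattern above is that `re.findall` silently skips any row it cannot match. The following is a minimal sketch, not a definitive implementation, of a stricter variant that uses named groups and returns the unmatched lines alongside the parsed records. The third row (`unassigned`, `administratively down`) is added here for illustration and is not part of the original sample; `parse_rows` and `ROW` are hypothetical names.

```python
import re

# The article's sample plus one extra row (added for illustration)
# that the strict pattern below cannot match.
OUTPUT = """Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 10.10.10.1 YES NVRAM up up
GigabitEthernet2 10.10.10.2 YES NVRAM down down
GigabitEthernet3 unassigned YES unset administratively down down"""

# Named groups give each captured field a stable key in the output record.
ROW = re.compile(
    r"^(?P<interface>GigabitEthernet\d+)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+YES\s+NVRAM\s+"
    r"(?P<status>up|down)\s+(?P<protocol>up|down)\s*$"
)

def parse_rows(text):
    """Return (records, unmatched) so skipped lines are reported, not dropped."""
    records, unmatched = [], []
    for line in text.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        match = ROW.match(line)
        if match:
            records.append(match.groupdict())
        else:
            unmatched.append(line)
    return records, unmatched

records, unmatched = parse_rows(OUTPUT)
print(records)
print(unmatched)
```

Returning the unmatched lines lets a caller decide whether a partial parse is acceptable or whether the template needs updating.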

Parser Libraries

Parser libraries offer a more flexible approach to data extraction by providing a set of rules that can be applied to parse and extract data from show interface outputs.

Advantages and Disadvantages

Example Use Cases and Code

Parser libraries are suitable for extracting data from devices with output formats that may vary slightly but still follow a structured pattern. They are useful in network management systems that need to support a wide range of devices.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParserLibrary {
    public static void main(String[] args) {
        // A Java text block must start its content on the line after the opening delimiter
        String output = """
Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 10.10.10.1 YES NVRAM up up
GigabitEthernet2 10.10.10.2 YES NVRAM down down""";

        Pattern pattern = Pattern.compile("GigabitEthernet(\\d+)\\s+([0-9\\.]+)\\s+YES\\s+NVRAM\\s+(up|down)\\s+(up|down)");
        Matcher matcher = pattern.matcher(output);

        while (matcher.find()) {
            System.out.println("Interface: GigabitEthernet" + matcher.group(1) + ", IP: " + matcher.group(2) + ", Status: " + matcher.group(3) + " " + matcher.group(4));
        }
    }
}
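
To illustrate how a parser library can support a range of devices behind one interface, here is a small sketch of a registry that dispatches to a per-platform parser and normalizes every result to the same record shape. The platform names `vendor_a` and `vendor_b`, the sentence-style `vendor_b` layout, and all function names are assumptions made for this example; they do not describe any particular library.

```python
import re
from typing import Callable, Dict, List

Record = Dict[str, str]
Parser = Callable[[str], List[Record]]

# Registry mapping a platform name to its parser function.
PARSERS: Dict[str, Parser] = {}

def register(platform: str) -> Callable[[Parser], Parser]:
    """Decorator that files a parser under the given platform name."""
    def decorator(func: Parser) -> Parser:
        PARSERS[platform] = func
        return func
    return decorator

@register("vendor_a")  # hypothetical platform: column layout from the article's sample
def parse_vendor_a(text: str) -> List[Record]:
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) == 6:  # rows with a different column count are skipped here
            rows.append({"interface": parts[0], "ip": parts[1],
                         "status": parts[4], "protocol": parts[5]})
    return rows

VENDOR_B_LINE = re.compile(
    r"^(\S+) is (up|down), line protocol is (up|down), address (\S+)$",
    re.MULTILINE,
)

@register("vendor_b")  # hypothetical platform: invented sentence-style layout
def parse_vendor_b(text: str) -> List[Record]:
    return [{"interface": name, "ip": ip, "status": status, "protocol": protocol}
            for name, status, protocol, ip in VENDOR_B_LINE.findall(text)]

def parse(platform: str, text: str) -> List[Record]:
    """Dispatch to the registered parser; unknown platforms fail loudly."""
    if platform not in PARSERS:
        raise ValueError(f"no parser registered for platform {platform!r}")
    return PARSERS[platform](text)

VENDOR_A_OUTPUT = """Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 10.10.10.1 YES NVRAM up up"""
VENDOR_B_OUTPUT = "GigabitEthernet1 is up, line protocol is up, address 10.10.10.1"

print(parse("vendor_a", VENDOR_A_OUTPUT))
print(parse("vendor_b", VENDOR_B_OUTPUT))
```

Both calls yield the same record, which is the point of the design: callers depend on the normalized schema, not on any vendor's layout.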

LLM-Assisted Extraction

LLM-assisted extraction leverages the capabilities of Large Language Models to extract data from show interface outputs. This approach can handle complex and variable output formats but requires significant computational resources.

Advantages and Disadvantages

Example Use Cases and Code

LLM-assisted extraction is particularly useful for devices with highly variable or unstructured output formats, and in scenarios where the output format changes frequently. Where a high degree of accuracy is required, the extracted fields should still be checked against the raw output before they are used.

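import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "llm_model" is a placeholder; substitute the instruction-tuned model you use.
# Extraction needs a text-generating (causal) model; a classification head
# would only return a label index.
model = AutoModelForCausalLM.from_pretrained("llm_model")
tokenizer = AutoTokenizer.from_pretrained("llm_model")

# Example output in the style of Cisco IOS "show ip interface brief"
output = """Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 10.10.10.1 YES NVRAM up up
GigabitEthernet2 10.10.10.2 YES NVRAM down down"""

# Build a prompt that asks for the fields as JSON
prompt = (
    "Extract interface, ip, status, and protocol from the CLI output below "
    "as a JSON list of objects.\n\n" + output + "\n\nJSON:"
)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Generate without sampling so repeated runs give the same answer
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = generated[0][inputs["input_ids"].shape[1]:]
extracted_data = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(f"Extracted Data: {extracted_data}")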
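
Because a model can return values that never appeared on the device, one way to build operator trust is to check each extracted record against the raw text. The sketch below is an illustration under stated assumptions: `LLM_RESPONSE` is a hand-written stand-in for a model reply (with one deliberately invented IP address), and `grounded` and `split_by_grounding` are hypothetical helper names.

```python
import json

RAW = """Interface IP-Address OK? Method Status Protocol
GigabitEthernet1 10.10.10.1 YES NVRAM up up
GigabitEthernet2 10.10.10.2 YES NVRAM down down"""

# Hypothetical model reply; the second record contains an invented IP address.
LLM_RESPONSE = """[
  {"interface": "GigabitEthernet1", "ip": "10.10.10.1", "status": "up", "protocol": "up"},
  {"interface": "GigabitEthernet2", "ip": "10.10.10.20", "status": "down", "protocol": "down"}
]"""

def grounded(record, raw):
    """True if every value appears as a whole token on a single line of raw."""
    values = [str(value) for value in record.values()]
    for line in raw.splitlines():
        tokens = line.split()
        if all(value in tokens for value in values):
            return True
    return False

def split_by_grounding(response, raw):
    """Return (accepted, rejected); an unparseable reply is rejected whole."""
    try:
        records = json.loads(response)
    except json.JSONDecodeError:
        return [], [response]
    if not isinstance(records, list):
        return [], [records]
    accepted, rejected = [], []
    for record in records:
        if isinstance(record, dict) and grounded(record, raw):
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected

accepted, rejected = split_by_grounding(LLM_RESPONSE, RAW)
print("accepted:", accepted)
print("rejected:", rejected)
```

Matching on whole tokens rather than substrings matters here: `10.10.10.2` is a substring of the invented `10.10.10.20`, and a substring check in the other direction would be equally easy to fool.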

Comparison of Approaches

Comparing deterministic templates, parser libraries, and LLM-assisted extraction involves evaluating their performance in terms of schema fidelity, abstention behavior, and operator trust.
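
Schema fidelity and abstention can be made concrete with a validation step that any of the three extractors could feed into. The following is a minimal sketch, assuming a four-field record and a two-value state set; `InterfaceRecord`, `ALLOWED_STATES`, and `validate` are names introduced for illustration, and real devices report more states than `up` and `down`.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class InterfaceRecord:
    interface: str
    ip: str
    status: str
    protocol: str

# Assumption for this sketch: only these two states are valid.
ALLOWED_STATES = {"up", "down"}
REQUIRED = ("interface", "ip", "status", "protocol")

def validate(candidate: dict) -> Optional[InterfaceRecord]:
    """Return a typed record, or None (abstain) if any field fails the schema."""
    if set(candidate) != set(REQUIRED):
        return None  # missing or unexpected keys
    if not all(isinstance(candidate[key], str) for key in REQUIRED):
        return None
    try:
        ipaddress.ip_address(candidate["ip"])
    except ValueError:
        return None  # not a syntactically valid IP address
    if candidate["status"] not in ALLOWED_STATES:
        return None
    if candidate["protocol"] not in ALLOWED_STATES:
        return None
    return InterfaceRecord(**candidate)

good = {"interface": "GigabitEthernet1", "ip": "10.10.10.1",
        "status": "up", "protocol": "up"}
bad_ip = dict(good, ip="10.10.10.999")
missing = {"interface": "GigabitEthernet1", "ip": "10.10.10.1"}

print(validate(good))
print(validate(bad_ip))
print(validate(missing))
```

Returning `None` instead of a partially filled record is the abstention behavior: the caller sees an explicit "no answer" rather than a plausible-looking guess.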

Schema Fidelity Comparison

Abstention Behavior Comparison

Operator Trust Comparison

Troubleshooting Common Issues

Common issues in data extraction from show interface outputs include handling terminal formatting drift, multi-vendor show interface outputs, and messy output data.
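
Much terminal formatting drift can be removed before any extractor sees the text. The sketch below shows one possible normalization pass, assuming the noise consists of ANSI escape sequences, mixed line endings, `--More--` pager prompts with their backspace erase sequences, and trailing whitespace. The function name and the patterns are illustrative and would need tuning for a given device fleet.

```python
import re

# ANSI CSI sequences such as colour codes: ESC [ parameters final-letter
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")
# A pager prompt plus the backspaces and spaces a terminal uses to erase it.
# Note: this also consumes spaces that directly follow the prompt on that line.
PAGER_PROMPT = re.compile(r"[ \t]*-+[ \t]*[Mm]ore[ \t]*-+[\x08 \t]*")

def normalize_cli_output(raw: str) -> str:
    """Strip terminal noise so every extractor sees the same plain text."""
    text = ANSI_ESCAPE.sub("", raw)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = PAGER_PROMPT.sub("", text)
    text = text.replace("\x08", "")  # any stray backspaces left over
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)

ERASE = "\x08" * 9 + " " * 9 + "\x08" * 9
raw = (
    "Interface IP-Address OK? Method Status Protocol\r\n"
    "\x1b[32mGigabitEthernet1 10.10.10.1 YES NVRAM up up\x1b[0m   \r\n"
    " --More-- " + ERASE + "GigabitEthernet2 10.10.10.2 YES NVRAM down down\r\n"
)

clean = normalize_cli_output(raw)
print(clean)
```

Leading whitespace is deliberately preserved on ordinary lines, since indentation often carries meaning in detailed show interface output.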

Handling Terminal Formatting Drift

Handling Multi-Vendor Show Interface Outputs

Handling Messy Output Data

Scaling Limitations

Each approach has scaling limitations that affect its performance as the volume of data, diversity of output formats, or complexity of the extraction task increases.

Scaling Limitations of Deterministic Templates

Scaling Limitations of Parser Libraries

Scaling Limitations of LLM-Assisted Extraction

Best Practices for Implementation

Best practices for implementing deterministic templates, parser libraries, and LLM-assisted extraction include careful planning, testing, and maintenance.
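
The testing and maintenance point can be sketched as a golden-output regression suite: captured raw output is stored next to the records it is expected to produce, and every change to a template, parser, or prompt is run against those pairs. In this example `parse_brief` is a stand-in parser and the fixtures are inlined; in practice they would more likely be files captured from real devices.

```python
import unittest

def parse_brief(text):
    """Stand-in parser: split six-column rows into records."""
    records = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) == 6:
            records.append({"interface": parts[0], "ip": parts[1],
                            "status": parts[4], "protocol": parts[5]})
    return records

# Each case: (name, raw output, expected records). Inlined for illustration.
GOLDEN_CASES = [
    (
        "two_interfaces",
        "Interface IP-Address OK? Method Status Protocol\n"
        "GigabitEthernet1 10.10.10.1 YES NVRAM up up\n"
        "GigabitEthernet2 10.10.10.2 YES NVRAM down down",
        [
            {"interface": "GigabitEthernet1", "ip": "10.10.10.1",
             "status": "up", "protocol": "up"},
            {"interface": "GigabitEthernet2", "ip": "10.10.10.2",
             "status": "down", "protocol": "down"},
        ],
    ),
    ("header_only", "Interface IP-Address OK? Method Status Protocol", []),
]

class GoldenOutputTest(unittest.TestCase):
    def test_golden_cases(self):
        for name, raw, expected in GOLDEN_CASES:
            with self.subTest(case=name):
                self.assertEqual(parse_brief(raw), expected)

def run_golden_suite():
    """Run the suite programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GoldenOutputTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)

result = run_golden_suite()
print("golden cases passed:", result.wasSuccessful())
```

The same fixtures can be reused across all three approaches, which makes it possible to compare them on identical inputs.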

Best Practices for Deterministic Templates

Best Practices for Parser Libraries

Best Practices for LLM-Assisted Extraction

The future of data extraction from show interface outputs is likely to involve further integration of AI and machine learning technologies, such as LLMs, to improve flexibility, accuracy, and scalability.

Future Directions for Parser Libraries and LLM-Assisted Extraction

