LinkState

Deny-by-default exceptions without opening the subnet

Introduction to Access Control Lists

Manual ACL drafting involves writing access control entries directly in the native syntax of the enforcement device (e.g., Cisco IOS access-list, Linux iptables, AWS Security Group rules, or Juniper firewall filters). An engineer identifies the required traffic flow, determines the source/destination addresses, ports, protocols, and then crafts a rule such as:

ip access-list extended TEMP-SVC-ALLOW
 permit tcp host 10.0.1.25 host 10.0.2.100 eq 443
 deny   ip any any

The rule is applied to an interface, VLAN, or security group via a CLI command or API call. In deny‑by‑default environments, the baseline is an implicit deny all at the end of each ACL, and every explicit permit must be justified and scoped.
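To make the first-match and implicit-deny behaviour concrete, here is a minimal Python sketch that models the intent of the rule above (HTTPS from 10.0.1.25 to 10.0.2.100). The `Entry` class and `evaluate` function are hypothetical names used only for illustration; real devices match on more fields than this model does.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Entry:
    action: str                      # "permit" or "deny"
    protocol: str                    # "tcp", "udp", or "ip" (any protocol)
    src: str                         # source network in CIDR form
    dst: str                         # destination network in CIDR form
    dst_port: Optional[int] = None   # None matches any destination port


def evaluate(acl, protocol, src, dst, dst_port):
    """Return the action of the first matching entry, or "deny" if none match."""
    for entry in acl:
        if entry.protocol not in ("ip", protocol):
            continue
        if ip_address(src) not in ip_network(entry.src):
            continue
        if ip_address(dst) not in ip_network(entry.dst):
            continue
        if entry.dst_port is not None and entry.dst_port != dst_port:
            continue
        return entry.action
    return "deny"  # the implicit deny at the end of every ACL


# One scoped permit; everything else falls through to the implicit deny.
TEMP_SVC_ALLOW = [Entry("permit", "tcp", "10.0.1.25/32", "10.0.2.100/32", 443)]

print(evaluate(TEMP_SVC_ALLOW, "tcp", "10.0.1.25", "10.0.2.100", 443))  # permit
print(evaluate(TEMP_SVC_ALLOW, "tcp", "10.0.1.26", "10.0.2.100", 443))  # deny
```

The second call shows why each permit has to be scoped: a neighbouring host in the same subnet gets nothing unless an entry names it.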

Benefits and Drawbacks of Manual ACL Management

Benefits

Drawbacks

Policy-as-Code Compilation

Definition and Principles

Policy‑as‑Code (PaC) treats access control rules as declarative artifacts stored in version‑controlled repositories. A compiler or interpreter translates the high‑level description into the native ACL syntax of the target enforcement point. Core principles include:

Tools and Frameworks

| Tool/Framework | Target Enforcement | Language/DSL | Notable Features |
| --- | --- | --- | --- |
| Terraform (AWS provider) | AWS Security Groups, NACLs | HCL | State‑driven, plan/apply workflow, drift detection |
| AWS CDK | AWS Security Groups, NACLs | TypeScript/Python/Java/.NET | Constructs library, higher‑level abstractions |
| Cisco ACI Model | Cisco ACI contracts/filters | Python (acitoolkit) or YAML | Model‑driven, API‑centric |
| Juniper Contrail/Contrail Networking | Juniper security policies | YAML/Jinja | Service‑chain insertion, policy rendering |
| Open Policy Agent (OPA) | Generic (Envoy, Kubernetes, cloud‑native) | Rego | Policy evaluation as a service, pluggable adapters |
| Ansible + ios_acls / nxos_acls modules | Cisco IOS/NX‑OS | YAML | Idempotent playbooks, check mode |
| aclgen (community tool) | Various vendor ACLs | Python DSL | Compiles service‑source‑expiry tuples to vendor syntax |
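The compile step these tools share can be sketched in a few lines of Python. This is an illustration under assumptions, not the implementation of any tool listed above: `AccessException`, `render_ios`, `render_iptables`, and `compile_policy` are hypothetical names, and only TCP rules are handled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from ipaddress import ip_network


@dataclass(frozen=True)
class AccessException:
    name: str          # ticket or change reference
    source: str        # source CIDR
    dest: str          # destination CIDR
    port: int          # destination TCP port
    expires: datetime  # timezone-aware expiry


def _ios_addr(cidr):
    """Render a CIDR as IOS 'host a.b.c.d' or 'network wildcard-mask'."""
    net = ip_network(cidr)
    if net.prefixlen == net.max_prefixlen:
        return f"host {net.network_address}"
    return f"{net.network_address} {net.hostmask}"


def render_ios(exc):
    return [
        f" remark {exc.name} expires {exc.expires.isoformat()}",
        f" permit tcp {_ios_addr(exc.source)} {_ios_addr(exc.dest)} eq {exc.port}",
    ]


def render_iptables(exc):
    return [
        f"iptables -A FORWARD -p tcp -s {exc.source} -d {exc.dest} "
        f'--dport {exc.port} -m comment --comment "{exc.name}" -j ACCEPT'
    ]


RENDERERS = {"ios": render_ios, "iptables": render_iptables}


def compile_policy(exceptions, target, now):
    """Render every unexpired exception in the target's native syntax."""
    render = RENDERERS[target]
    lines = []
    for exc in exceptions:
        if exc.expires > now:  # expired exceptions are dropped at compile time
            lines.extend(render(exc))
    return lines


now = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
exc = AccessException("INC12345", "10.10.20.0/24", "10.0.2.100/32", 443,
                      datetime(2025, 1, 1, 4, 0, tzinfo=timezone.utc))
for line in compile_policy([exc], "ios", now):
    print(line)
```

Because expiry is checked at compile time, re-running the compiler after the window closes yields an ACL without the exception; something still has to trigger that recompile and push the result.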

Example: Terraform Configuration for Time‑Restricted Access

# variables.tf
variable "service_sg_id" {
  description = "Security Group ID of the backend service"
  type        = string
}
variable "temp_cidr" {
  description = "CIDR block of the source needing temporary access"
  type        = string
}
variable "start_time" {
  description = "Unix timestamp when access begins"
  type        = number
}
variable "end_time" {
  description = "Unix timestamp when access expires"
  type        = number
}

# main.tf
resource "aws_security_group_rule" "temp_api_allow" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  cidr_blocks              = [var.temp_cidr]
  security_group_id        = var.service_sg_id
  description              = "Temporary API access for ${var.temp_cidr} from ${var.start_time} to ${var.end_time}"

  # AWS security groups have no native time-based rules, and
  # aws_security_group_rule accepts no tags argument, so the expiry is
  # recorded in the description above and in the tracking table read by
  # the rotation Lambda, which removes the rule after end_time.
}

# rotation_lambda.tf (simplified)
resource "aws_lambda_function" "sg_rotator" {
  filename      = "sg_rotator.zip"
  function_name = "sg_temporary_access_rotator"
  role          = aws_iam_role.lambda_exec.arn
  runtime       = "python3.9"
  handler       = "rotator.lambda_handler"
  environment {
    variables = {
      TABLE_NAME = aws_dynamodb_table.rule_tracker.name
    }
  }
}

resource "aws_cloudwatch_event_rule" "sg_rotation_schedule" {
  name                = "sg-rotation-every-5-min"
  schedule_expression = "rate(5 minutes)"
}

resource "aws_cloudwatch_event_target" "sg_rotation_target" {
  rule      = aws_cloudwatch_event_rule.sg_rotation_schedule.name
  target_id = "SGRotator"
  arn       = aws_lambda_function.sg_rotator.arn
}

resource "aws_lambda_permission" "allow_cw" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.sg_rotator.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.sg_rotation_schedule.arn
}

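The Lambda (sg_rotator) queries a DynamoDB table that tracks temporary rules (populated at rule creation) and deletes any rule whose ExpiresAt timestamp has passed. Because the schedule fires every 5 minutes, a rule can outlive its expiry by up to about 5 minutes. This pattern enforces expiry without relying on native time‑based security group rules, which AWS does not offer.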
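A minimal sketch of what `rotator.lambda_handler` might look like follows. The table's attribute names (`RuleId`, `GroupId`, `ExpiresAt`) and its key schema are assumptions, since the post does not show the table definition. The expiry selection is kept as a pure function so it can be unit-tested without AWS access.

```python
import os
import time
from collections import defaultdict


def select_expired(items, now):
    """Return tracked rules whose ExpiresAt (Unix seconds) is at or before now."""
    return [item for item in items if float(item["ExpiresAt"]) <= now]


def lambda_handler(event, context):
    import boto3  # bundled with the AWS Lambda Python runtime

    table = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])
    ec2 = boto3.client("ec2")

    # Scan pagination is omitted here; a large table would need to follow
    # LastEvaluatedKey across multiple scan calls.
    expired = select_expired(table.scan()["Items"], time.time())

    # Group rule IDs by security group so each group needs one revoke call.
    by_group = defaultdict(list)
    for item in expired:
        by_group[item["GroupId"]].append(item["RuleId"])

    for group_id, rule_ids in by_group.items():
        ec2.revoke_security_group_ingress(
            GroupId=group_id, SecurityGroupRuleIds=rule_ids
        )
        # Assumes RuleId is the table's partition key.
        for rule_id in rule_ids:
            table.delete_item(Key={"RuleId": rule_id})

    return {"revoked": len(expired)}
```

Removing a rule outside Terraform leaves the state file stale, so the next `terraform plan` would propose recreating it unless the resource is also removed from the configuration.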

CLI Examples for Policy-as-Code Deployment

# Initialize Terraform workspace
terraform init

# Validate syntax and check for drift
terraform validate
terraform plan -out=tfplan

# Apply the plan (creates/updates SG rule)
terraform apply tfplan

# Destroy temporary access after manual approval (or let Lambda clean up)
terraform destroy -target=aws_security_group_rule.temp_api_allow

# Using AWS CDK (TypeScript) for the same goal
npm install -g aws-cdk
cdk init app --language typescript

// lib/temporary-access-stack.ts
import { Stack, StackProps, aws_ec2 as ec2 } from 'aws-cdk-lib';
import { Construct } from 'constructs';

// StackProps has no serviceSgId or sourceCidr fields, so declare them explicitly.
export interface TemporaryAccessStackProps extends StackProps {
  serviceSgId: string;
  sourceCidr: string;
}

export class TemporaryAccessStack extends Stack {
  constructor(scope: Construct, id: string, props: TemporaryAccessStackProps) {
    super(scope, id, props);
    // Import the existing service security group by ID, then add the ingress rule.
    const sg = ec2.SecurityGroup.fromSecurityGroupId(this, 'ServiceSG', props.serviceSgId);
    sg.addIngressRule(ec2.Peer.ipv4(props.sourceCidr), ec2.Port.tcp(443), 'Temporary API access');
  }
}

# Deploy
cdk deploy

The workflow follows plan → review → apply → (optional) destroy, with automated expiry handled by a side‑car process.

LLM-Assisted Exception Authoring

Introduction

Large Language Models (LLMs) can synthesize policy snippets from natural‑language requests, reducing the cognitive load on engineers who need to craft temporary exceptions. In a deny‑by‑default setting, the LLM is prompted with constraints (service, source, expiry, validation tests) and outputs a candidate ACL rule or PaC fragment. The output is not applied directly; it undergoes deterministic validation (syntax check, conflict analysis, policy testing) before being committed to the policy repository.

Workflow and Benefits

  1. Request Capture – Engineer submits a ticket or chat message:
    “Grant the monitoring team (10.10.20.0/24) temporary HTTPS access to the payment‑api service (sg‑pay‑api) for the next 4 hours, validate with curl test.”
  2. Prompt Construction – System builds a prompt that includes:
    • Base deny‑by‑default policy context.
    • Extracted entities: service (sg‑pay‑api), source (10.10.20.0/24), protocol/port (tcp/443), expiry (now + 4h).
    • Required validation test (e.g., nc -zv 10.0.2.100 443 must succeed within the window).
  3. LLM Inference – The model returns a candidate rule, e.g.:
    resource "aws_vpc_security_group_ingress_rule" "temp_mon_https" {
      security_group_id = aws_security_group.service_pay_api.id
      cidr_ipv4         = "10.10.20.0/24"
      ip_protocol       = "tcp"
      from_port         = 443
      to_port           = 443
      description       = "Temporary HTTPS for monitoring team, expires at ${var.expiry}"
      # Unlike aws_security_group_rule, this resource type accepts tags.
      tags = { Purpose = "temporary-access", ExpiresAt = tostring(var.expiry) }
    }
  4. Deterministic Validation – The candidate is:
    • Syntax‑checked via terraform validate or cdk synth.
    • Scanned in a sandbox for over‑permissiveness with policy checks (e.g., checkov or opa test).
    • Verified against a policy matrix (service‑source‑expiry) using a unit test.
  5. Human Review & Commit – Engineer reviews the diff, approves, and merges to main. CI pipeline runs the validation steps again; on success, the change is deployed.
  6. Automated Cleanup – A scheduled job (Lambda, CronJob, or Flux) removes the rule when the expiry timestamp passes.
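The policy-matrix check in step 4 can be sketched as a small deterministic function. The matrix contents, the 8-hour ceiling, and the names `Candidate`, `POLICY_MATRIX`, and `violations` are all assumptions made for illustration; a real matrix would come from the policy repository.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import List


@dataclass(frozen=True)
class Candidate:
    service: str       # target security group or service name
    source_cidr: str   # requested source range
    port: int          # requested destination port
    duration_s: int    # requested lifetime in seconds


# Hypothetical matrix: which sources, ports, and lifetimes each service tolerates.
POLICY_MATRIX = {
    "sg-pay-api": {
        "sources": ["10.10.0.0/16"],
        "ports": {443},
        "max_duration_s": 8 * 3600,
    },
}


def violations(candidate, matrix=POLICY_MATRIX) -> List[str]:
    """Return a list of reasons the candidate falls outside the matrix."""
    policy = matrix.get(candidate.service)
    if policy is None:
        return [f"no exceptions are defined for {candidate.service}"]

    problems = []
    src = ip_network(candidate.source_cidr)
    if not any(src.subnet_of(ip_network(allowed)) for allowed in policy["sources"]):
        problems.append(f"source {candidate.source_cidr} is outside approved ranges")
    if candidate.port not in policy["ports"]:
        problems.append(f"port {candidate.port} is not approved")
    if candidate.duration_s > policy["max_duration_s"]:
        problems.append("requested duration exceeds the maximum")
    return problems


ok = Candidate("sg-pay-api", "10.10.20.0/24", 443, 4 * 3600)
too_broad = Candidate("sg-pay-api", "0.0.0.0/0", 443, 24 * 3600)
print(violations(ok))         # []
print(violations(too_broad))  # two violations: source range and duration
```

Since the check is plain code with no model in the loop, the same candidate always gets the same verdict, which is what makes it usable as a CI gate.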

Benefits

Code Example: Integrating LLMs with Terraform Generation

import os
import subprocess
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def build_prompt(request: str) -> str:
    return f"""
You are a network policy engineer. Convert the following request into a Terraform HCL snippet for an AWS Security Group rule.
Constraints:
- Deny‑by‑default baseline.
- Must include source CIDR, destination service SG ID variable, protocol/port.
- Add a tag `ExpiresAt` with a Unix timestamp for the requested duration.
- Add a tag `Ticket` with the ticket ID if provided.
- Output ONLY the HCL block, no extra text.

Request: {request}
"""

def call_llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
        max_tokens=256,
    )
    return response.choices[0].message.content.strip()

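def validate_terraform(hcl: str, workdir: str = "/tmp/tf_validate") -> bool:
    """Write the snippet to workdir and run terraform init and validate on it."""
    os.makedirs(workdir, exist_ok=True)
    tf_path = os.path.join(workdir, "main.tf")
    with open(tf_path, "w") as f:
        f.write(hcl)
    # -backend=false skips state backend setup; providers are still installed.
    # A snippet that references undeclared variables or resources fails validate,
    # so workdir needs matching declarations alongside main.tf.
    commands = (
        ["terraform", "init", "-backend=false", "-input=false"],
        ["terraform", "validate"],
    )
    for cmd in commands:
        result = subprocess.run(cmd, cwd=workdir, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"{' '.join(cmd)} failed:", result.stderr or result.stdout)
            return False
    return True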

def main():
    request = "Grant the monitoring team (10.10.20.0/24) temporary HTTPS access to the payment-api service for the next 4 hours, ticket INC12345."
    prompt = build_prompt(request)
    hcl = call_llm(prompt)
    print("Generated HCL:\n", hcl)
    if validate_terraform(hcl):
        print("HCL passes validation – ready for PR.")
    else:
        print("HCL invalid – prompt engineering needed.")

if __name__ == "__main__":
    main()
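Before the generated text reaches `terraform validate`, a cheap text-level guard can catch two common problems: models sometimes wrap output in markdown fences despite the "Output ONLY the HCL block" instruction, and a syntactically valid rule can still be far too broad. The sketch below is an assumption-laden illustration; `strip_fences`, `broad_cidrs`, and the /16 threshold are hypothetical, and the regex only recognizes quoted IPv4 CIDRs.

```python
import re
from ipaddress import ip_network

FENCE = "`" * 3  # markdown code-fence marker
CIDR = re.compile(r'"(\d{1,3}(?:\.\d{1,3}){3}/\d{1,2})"')


def strip_fences(text):
    """Drop a leading and trailing markdown fence line, if present."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].startswith(FENCE):
        lines = lines[:-1]
    return "\n".join(lines)


def broad_cidrs(hcl, min_prefix=16):
    """Return quoted IPv4 CIDRs whose prefix is shorter than min_prefix."""
    found = CIDR.findall(hcl)
    return [c for c in found if ip_network(c, strict=False).prefixlen < min_prefix]


raw = FENCE + "hcl\n" + 'cidr_blocks = ["0.0.0.0/0"]\n' + FENCE
clean = strip_fences(raw)
print(clean)               # cidr_blocks = ["0.0.0.0/0"]
print(broad_cidrs(clean))  # ['0.0.0.0/0']
```

A guard like this complements the validation step without replacing it: it rejects obviously unsafe output early, while `terraform validate` and the policy checks remain the authority.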
