Introduction to Access Control Lists
Manual ACL drafting involves writing access control entries directly in the native syntax of the enforcement device (e.g., Cisco IOS access-list, Linux iptables, AWS Security Group rules, or Juniper firewall filters). An engineer identifies the required traffic flow, determines the source and destination addresses, ports, and protocols, and then crafts a rule such as:
ip access-list extended TEMP-SVC-ALLOW
 permit tcp host 10.0.1.25 host 10.0.2.100 eq 443
 deny ip any any
The rule is applied to an interface, VLAN, or security group via a CLI command or API call. In deny‑by‑default environments, the baseline is an implicit deny all at the end of each ACL, and every explicit permit must be justified and scoped.
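The first-match behaviour behind an implicit deny can be sketched in a few lines of Python. This is a simplified model for illustration, not how any particular device implements matching; the `Rule` class and `evaluate` function are hypothetical names, and only destination ports are modelled.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "permit" or "deny"
    protocol: str                    # "tcp", "udp", or "ip" (any protocol)
    src: str                         # source network in CIDR form
    dst: str                         # destination network in CIDR form
    dst_port: Optional[int] = None   # None matches any destination port


def evaluate(rules, protocol, src, dst, dst_port):
    """Return the action of the first matching rule, else the implicit deny."""
    for rule in rules:
        if rule.protocol not in ("ip", protocol):
            continue
        if ip_address(src) not in ip_network(rule.src):
            continue
        if ip_address(dst) not in ip_network(rule.dst):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "deny"  # implicit deny: no explicit entry matched


acl = [Rule("permit", "tcp", "10.0.1.25/32", "10.0.2.100/32", 443)]
print(evaluate(acl, "tcp", "10.0.1.25", "10.0.2.100", 443))  # permit
print(evaluate(acl, "tcp", "10.0.1.26", "10.0.2.100", 443))  # deny
```

The second call falls through every entry and hits the implicit deny, which is why each explicit permit has to be scoped as narrowly as the traffic flow allows.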
Benefits and Drawbacks of Manual ACL Management
Benefits
- Immediate visibility: each line is human‑readable and can be audited directly on the device.
- No translation layer: the rule that is typed is exactly what the enforcement engine evaluates.
- Low tooling overhead: only a CLI or API client is required.
Drawbacks
- Error‑prone: typos in addresses, wildcard masks, or protocol numbers create unintended permits or blocks.
- No version control: changes are often made ad‑hoc, making rollback and audit trails difficult.
- Scaling limits: large numbers of rules increase evaluation latency and complicate conflict detection.
- Drift: manual changes bypass automated compliance checks, so policy‑as‑deployed diverges from policy‑as‑intended.
- Temporary access handling: engineers must remember to add expiry logic (e.g., time‑based ACLs) and later clean up rules, which is easily forgotten.
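The drift problem listed above can be made concrete with a small comparison of intended and deployed rule lines. This is a minimal sketch with a hypothetical `diff_acl` helper; it normalizes whitespace and case only, and it ignores rule order, which does matter on real devices.

```python
def diff_acl(intended, deployed):
    """Report rules missing from the device and rules nobody intended."""
    def normalize(line):
        # Collapse whitespace and case so cosmetic differences are ignored.
        return " ".join(line.split()).lower()

    want = {normalize(line) for line in intended if line.strip()}
    have = {normalize(line) for line in deployed if line.strip()}
    return {
        "missing": sorted(want - have),      # intended but not deployed
        "unexpected": sorted(have - want),   # deployed but not intended
    }


intended = [
    "permit tcp host 10.0.1.25 host 10.0.2.100 eq 443",
    "deny ip any any",
]
deployed = [
    " permit tcp host 10.0.1.25  host 10.0.2.100 eq 443",
    "permit ip any any",   # ad-hoc change made directly on the device
    "deny ip any any",
]
print(diff_acl(intended, deployed))
```

Any entry under `unexpected` is a candidate for a forgotten temporary rule or an unreviewed manual change.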
Policy-as-Code Compilation
Definition and Principles
Policy‑as‑Code (PaC) treats access control rules as declarative artifacts stored in version‑controlled repositories. A compiler or interpreter translates the high‑level description into the native ACL syntax of the target enforcement point. Core principles include:
- Idempotency: applying the same code repeatedly yields the same enforced state.
- Immutability: changes are made by committing new versions; the system never mutates in‑place without a new commit.
- Testability: unit and integration tests can validate that the compiled ACL meets service, source, expiry, and validation constraints before deployment.
- Traceability: each rule can be linked to a ticket, a change request, or a policy document via commit metadata.
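To illustrate the compile step and the idempotency principle, the sketch below renders declarative entries into Cisco IOS extended ACL syntax. The `AccessException` class, `ios_term`, and `compile_ios` are hypothetical names chosen for this example, and the output covers only TCP permits followed by an explicit deny.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class AccessException:
    source: str   # source CIDR
    dest: str     # destination CIDR
    port: int     # destination TCP port
    ticket: str   # change reference, kept for traceability


def ios_term(cidr):
    """Render a CIDR as an IOS address term (host form or wildcard mask)."""
    net = ip_network(cidr)
    if net.prefixlen == 32:
        return f"host {net.network_address}"
    return f"{net.network_address} {net.hostmask}"


def compile_ios(acl_name, entries):
    """Render entries deterministically: same input set, same output text."""
    lines = [f"ip access-list extended {acl_name}"]
    for e in sorted(entries, key=lambda e: (e.source, e.dest, e.port)):
        lines.append(f" remark {e.ticket}")
        lines.append(
            f" permit tcp {ios_term(e.source)} {ios_term(e.dest)} eq {e.port}"
        )
    lines.append(" deny ip any any")
    return "\n".join(lines)


entries = [AccessException("10.10.20.0/24", "10.0.2.100/32", 443, "INC12345")]
print(compile_ios("TEMP-SVC-ALLOW", entries))
```

Because the entries are sorted before rendering, recompiling the same set in any input order yields byte-identical output, which keeps diffs in version control meaningful and makes the rendered ACL easy to unit test.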
Tools and Frameworks
| Tool/Framework | Target Enforcement | Language/DSL | Notable Features |
|---|---|---|---|
| Terraform (AWS provider) | AWS Security Groups, NACLs | HCL | State‑driven, plan/apply workflow, drift detection |
| AWS CDK | AWS Security Groups, NACLs | TypeScript/Python/Java/.NET | Constructs library, higher‑level abstractions |
| Cisco ACI Model | Cisco ACI contracts/filters | Python (acitoolkit) or YAML | Model‑driven, API‑centric |
| Juniper Contrail/Contrail Networking | Juniper security policies | YAML/Jinja | Service‑chain insertion, policy rendering |
| Open Policy Agent (OPA) | Generic (Envoy, Kubernetes, cloud‑native) | Rego | Policy evaluation as a service, pluggable adapters |
| Ansible + ios_acl / nxos_acl modules | Cisco IOS/NX‑OS | YAML | Idempotent playbooks, check mode |
| aclgen (community tool) | Various vendor ACLs | Python DSL | Compiles service‑source‑expiry tuples to vendor syntax |
Example: Terraform Configuration for Time‑Restricted Access
# variables.tf
variable "service_sg_id" {
  description = "Security Group ID of the backend service"
  type        = string
}
variable "temp_cidr" {
  description = "CIDR block of the source needing temporary access"
  type        = string
}
variable "start_time" {
  description = "Unix timestamp when access begins"
  type        = number
}
variable "end_time" {
  description = "Unix timestamp when access expires"
  type        = number
}
# main.tf
resource "aws_security_group_rule" "temp_api_allow" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = [var.temp_cidr]
  security_group_id = var.service_sg_id
  description       = "Temporary API access for ${var.temp_cidr} from ${var.start_time} to ${var.end_time}"
  # AWS SG does not natively support time‑based rules; we rely on
  # a Lambda‑driven rotation that removes the rule after end_time.
  # aws_security_group_rule accepts no tags argument, so the expiry
  # metadata is carried in the description above and in the DynamoDB
  # tracking table that the rotation Lambda reads.
}
# rotation_lambda.tf (simplified)
resource "aws_lambda_function" "sg_rotator" {
  filename      = "sg_rotator.zip"
  function_name = "sg_temporary_access_rotator"
  role          = aws_iam_role.lambda_exec.arn
  runtime       = "python3.9"
  handler       = "rotator.lambda_handler"
  environment {
    variables = {
      # Assumes an aws_dynamodb_table resource named "rule_tracker"
      # is defined elsewhere (not shown in this simplified example).
      TABLE_NAME = aws_dynamodb_table.rule_tracker.name
    }
  }
}
resource "aws_cloudwatch_event_rule" "sg_rotation_schedule" {
  name                = "sg-rotation-every-5-min"
  schedule_expression = "rate(5 minutes)"
}
resource "aws_cloudwatch_event_target" "sg_rotation_target" {
  rule      = aws_cloudwatch_event_rule.sg_rotation_schedule.name
  target_id = "SGRotator"
  arn       = aws_lambda_function.sg_rotator.arn
}
resource "aws_lambda_permission" "allow_cw" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.sg_rotator.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.sg_rotation_schedule.arn
}
The Lambda (sg_rotator) queries a DynamoDB table that tracks temporary rules (populated at rule creation) and deletes any rule whose ExpiresAt timestamp is now past. This pattern enforces expiry without relying on native SG time‑based ACLs.
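The expiry check at the heart of such a rotator can be sketched as a pure function. The source does not show the Lambda body, so this is an assumption about its shape: the `rule_id` field and the `expired_rule_ids` name are hypothetical, and the DynamoDB read and the call that actually revokes the rule are deliberately left out.

```python
import time


def expired_rule_ids(tracked_rules, now=None):
    """Return IDs of tracked rules whose ExpiresAt timestamp has passed."""
    now = time.time() if now is None else now
    return [
        rule["rule_id"]
        for rule in tracked_rules
        if float(rule["ExpiresAt"]) <= now
    ]


# Items as they might come back from the tracking table (illustrative values).
tracked = [
    {"rule_id": "sgr-aaa", "ExpiresAt": 1_700_000_000},
    {"rule_id": "sgr-bbb", "ExpiresAt": 1_700_014_400},
]
print(expired_rule_ids(tracked, now=1_700_003_600))  # ['sgr-aaa']
```

Keeping the time comparison separate from the AWS calls lets the expiry logic be unit tested with a fixed `now`, and with a five-minute schedule a rule can outlive its expiry by up to five minutes.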
CLI Examples for Policy-as-Code Deployment
# Initialize Terraform workspace
terraform init
# Validate syntax, then preview changes (the plan also surfaces drift)
terraform validate
terraform plan -out=tfplan
# Apply the plan (creates/updates SG rule)
terraform apply tfplan
# Destroy temporary access after manual approval (or let Lambda clean up)
terraform destroy -target=aws_security_group_rule.temp_api_allow
# Using AWS CDK (TypeScript) for the same goal
npm install -g aws-cdk
cdk init app --language typescript
// lib/temporary-access-stack.ts
import { App, Stack, StackProps, aws_ec2 as ec2 } from 'aws-cdk-lib';
// StackProps has no SG or CIDR fields, so the stack declares its own props.
export interface TemporaryAccessStackProps extends StackProps {
  serviceSgId: string; // ID of the existing service security group
  sourceCidr: string;  // CIDR granted temporary access
}
export class TemporaryAccessStack extends Stack {
  constructor(scope: App, id: string, props: TemporaryAccessStackProps) {
    super(scope, id, props);
    const sg = ec2.SecurityGroup.fromSecurityGroupId(this, 'ServiceSG', props.serviceSgId);
    sg.addIngressRule(ec2.Peer.ipv4(props.sourceCidr), ec2.Port.tcp(443), 'Temporary API access');
  }
}
# Deploy
cdk deploy
The workflow follows plan → review → apply → (optional) destroy, with automated expiry handled by a side‑car process.
LLM-Assisted Exception Authoring
Introduction
Large Language Models (LLMs) can synthesize policy snippets from natural‑language requests, reducing the cognitive load on engineers who need to craft temporary exceptions. In a deny‑by‑default setting, the LLM is prompted with constraints (service, source, expiry, validation tests) and outputs a candidate ACL rule or PaC fragment. The output is not applied directly; it undergoes deterministic validation (syntax check, conflict analysis, policy testing) before being committed to the policy repository.
Workflow and Benefits
- Request Capture – Engineer submits a ticket or chat message:
  “Grant the monitoring team (10.10.20.0/24) temporary HTTPS access to the payment‑api service (sg‑pay‑api) for the next 4 hours, validate with curl test.”
- Prompt Construction – System builds a prompt that includes:
  - Base deny‑by‑default policy context.
  - Extracted entities: service (sg‑pay‑api), source (10.10.20.0/24), protocol/port (tcp/443), expiry (now + 4h).
  - Required validation test (e.g., nc -zv 10.0.2.100 443 must succeed within the window).
- LLM Inference – The model returns a candidate rule, e.g.:
  resource "aws_vpc_security_group_ingress_rule" "temp_mon_https" {
    security_group_id = aws_security_group.service_pay_api.id
    cidr_ipv4         = "10.10.20.0/24"
    ip_protocol       = "tcp"
    from_port         = 443
    to_port           = 443
    description       = "Temporary HTTPS for monitoring team (expires at ${var.expiry})"
    tags = {
      Purpose   = "temporary-access"
      ExpiresAt = var.expiry
    }
  }
  (The per‑rule aws_vpc_security_group_ingress_rule resource is used here because aws_security_group_rule does not accept tags.)
- Deterministic Validation – The candidate is:
  - Syntax‑checked via terraform validate or cdk synth.
  - Tested in a sandbox (e.g., checkov or opa test) for over‑permissiveness.
  - Verified against a policy matrix (service‑source‑expiry) using a unit test.
- Human Review & Commit – Engineer reviews the diff, approves, and merges to main. CI pipeline runs the validation steps again; on success, the change is deployed.
- Automated Cleanup – A scheduled job (Lambda, CronJob, or Flux) removes the rule when the expiry timestamp passes.
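The policy-matrix check in the validation step might look like the following sketch. The matrix contents, the eight-hour ceiling, and the `check_candidate` function are assumptions made for illustration; a real matrix would come from the policy repository.

```python
from ipaddress import ip_network

# Hypothetical matrix: per service, which sources, ports, and lifetimes are allowed.
POLICY_MATRIX = {
    "sg-pay-api": {
        "allowed_sources": ["10.10.0.0/16"],
        "ports": {443},
        "max_ttl_s": 8 * 3600,
    },
}


def check_candidate(service, source_cidr, port, ttl_seconds, matrix=POLICY_MATRIX):
    """Return a list of violations; an empty list means the candidate passes."""
    entry = matrix.get(service)
    if entry is None:
        return [f"service {service!r} has no matrix entry"]
    problems = []
    src = ip_network(source_cidr)
    allowed = [ip_network(a) for a in entry["allowed_sources"]]
    if not any(src.subnet_of(a) for a in allowed):
        problems.append(f"source {source_cidr} is outside the allowed ranges")
    if port not in entry["ports"]:
        problems.append(f"port {port} is not permitted for {service}")
    if not 0 < ttl_seconds <= entry["max_ttl_s"]:
        problems.append(f"expiry of {ttl_seconds}s exceeds the allowed window")
    return problems


print(check_candidate("sg-pay-api", "10.10.20.0/24", 443, 4 * 3600))  # []
print(check_candidate("sg-pay-api", "0.0.0.0/0", 443, 4 * 3600))
```

The check is deterministic and independent of the model, so an over-broad source such as 0.0.0.0/0 is rejected no matter how plausible the generated rule looks.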
Benefits
- Reduces manual lookup of object IDs and syntax.
- Encourages consistent tagging and metadata (expiry, ticket reference).
- Provides a natural‑language entry point for infrequent operators.
- Enables rapid generation of multiple similar exceptions (e.g., bulk onboarding).
Code Example: Integrating LLMs with Terraform Generation
import os
import subprocess

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


def build_prompt(request: str) -> str:
    """Wrap the natural-language request in the fixed policy constraints."""
    return f"""
You are a network policy engineer. Convert the following request into a Terraform HCL snippet for an AWS Security Group rule.
Constraints:
- Deny‑by‑default baseline.
- Must include source CIDR, destination service SG ID variable, protocol/port.
- Add a tag `ExpiresAt` with a Unix timestamp for the requested duration.
- Add a tag `Ticket` with the ticket ID if provided.
- Output ONLY the HCL block, no extra text.
Request: {request}
"""


def call_llm(prompt: str) -> str:
    """Ask the model for a candidate HCL block and return its text."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
        max_tokens=256,
    )
    return response.choices[0].message.content.strip()


def validate_terraform(hcl: str, workdir: str = "/tmp/tf_validate") -> bool:
    """Write the candidate to main.tf and run terraform init and validate."""
    os.makedirs(workdir, exist_ok=True)
    tf_path = os.path.join(workdir, "main.tf")
    with open(tf_path, "w") as f:
        f.write(hcl)
    # terraform init & validate
    subprocess.run(["terraform", "init"], cwd=workdir, check=True, capture_output=True)
    result = subprocess.run(
        ["terraform", "validate"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        print("Validation failed:", result.stderr)
        return False
    return True


def main():
    request = "Grant the monitoring team (10.10.20.0/24) temporary HTTPS access to the payment-api service for the next 4 hours, ticket INC12345."
    prompt = build_prompt(request)
    hcl = call_llm(prompt)
    print("Generated HCL:\n", hcl)
    if validate_terraform(hcl):
        print("HCL passes validation – ready for PR.")
    else:
        print("HCL invalid – prompt engineering needed.")


if __name__ == "__main__":
    main()
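One practical wrinkle, assumed here rather than taken from the script above: chat models often wrap their answer in a Markdown code fence even when told to output only HCL, and a fenced answer would fail `terraform validate`. A small normalizer, here called `strip_code_fence` (a hypothetical helper), could run between the model call and validation.

```python
import re

FENCE = "`" * 3  # a Markdown code fence marker

# Matches a response that is entirely one fenced block, with an optional
# language tag after the opening fence.
_FENCED = re.compile(
    r"^" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n" + FENCE + r"$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the body of a fully fenced response, or the text unchanged."""
    text = text.strip()
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text


wrapped = FENCE + 'hcl\nresource "x" "y" {}\n' + FENCE
print(strip_code_fence(wrapped))                 # resource "x" "y" {}
print(strip_code_fence('resource "x" "y" {}'))   # unchanged
```

Anything that is not a single fenced block passes through untouched, so explanatory text mixed with code still reaches the validator and fails loudly instead of being silently trimmed.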