โ† all lessons/๐Ÿ“„ Other/#00

AWS VPC & Networking

The 30-Second Pitch

Amazon VPC (Virtual Private Cloud) is the network foundation every AWS workload runs on. It is a logically isolated section of the AWS cloud where you define your own IP address space, subnets, route tables, and gateways — complete control over who can reach what, from where. When the question in an interview or a production incident is "why can't service A talk to service B?", the answer is almost always in a VPC construct: a missing route, a security group rule pointing at the wrong resource, a NACL blocking return traffic, or a missing endpoint forcing traffic through a NAT Gateway you didn't expect.

The deceptively simple primitives — subnets, route tables, security groups, NACLs — compose into complex topologies spanning multiple accounts, regions, and on-premises data centers. The patterns are well-established (public/private/data subnets per AZ; Transit Gateway for hub-and-spoke; PrivateLink for service exposure) but the gotchas are plentiful: NACLs are stateless, VPC peering is non-transitive, NAT Gateway costs are a common budget surprise, and DNS resolution requires two VPC settings to be enabled simultaneously.

For any distributed systems, platform, or senior cloud engineer interview, you are expected to design a multi-AZ 3-tier VPC from scratch, explain the difference between every connectivity option from peering to Direct Connect, and articulate the cost and security trade-offs at each layer. This article covers all of it.


How It Actually Works

Mental Model: Packets and Decisions

Every packet traversing your VPC hits a decision tree:

Packet enters subnet
        │
        ▼
   NACL Inbound?  ──── DENY ────► Drop
        │ ALLOW
        ▼
  Security Group?  ──── DENY ────► Drop
        │ ALLOW
        ▼
  Route Table  ──────────────────► local / IGW / NAT GW / TGW / VGW / endpoint
        │
        ▼
   Destination
        │
        ▼
  Security Group (return traffic — stateful: auto-allowed)
        │
        ▼
   NACL Outbound? ──── DENY ────► Drop
        │ ALLOW
        ▼
   Packet delivered

The critical insight: Security Groups are stateful (return traffic is automatically allowed), NACLs are stateless (you must explicitly allow return traffic, including ephemeral ports 1024–65535). Route tables use longest-prefix match — a /28 route always wins over a /16 route for a matching destination. The local route (VPC CIDR → local) is always present and cannot be deleted or overridden.
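Longest-prefix matching is easy to demonstrate locally. A toy route lookup in bash (the routes and targets below are invented for illustration; this sketches the matching rule, not how the VPC router is implemented):

```shell
# Toy longest-prefix-match route lookup (targets are illustrative).
ip_to_int() {                       # dotted quad -> 32-bit integer
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Route table: "CIDR target" pairs, like a subnet route table
routes="10.0.0.0/16 local
0.0.0.0/0 igw-0abc123
10.0.5.0/24 pcx-peering"

lookup() {                          # pick the matching route with the longest prefix
  local dst best_len=-1 best_target= cidr target net len mask
  dst=$(ip_to_int "$1")
  while read -r cidr target; do
    net=$(ip_to_int "${cidr%/*}"); len=${cidr#*/}
    mask=$(( len == 0 ? 0 : 0xFFFFFFFF << (32 - len) & 0xFFFFFFFF ))
    if (( (dst & mask) == (net & mask) && len > best_len )); then
      best_len=$len; best_target=$target
    fi
  done <<< "$routes"
  echo "$best_target"
}

lookup 10.0.5.9   # pcx-peering: the /24 beats the /16 local route
lookup 10.0.9.9   # local: only the /16 (and /0) match
lookup 8.8.8.8    # igw-0abc123: only 0.0.0.0/0 matches
```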


1. VPC Fundamentals

CIDR Design

A VPC spans the entire region and gets one or more CIDR blocks (IPv4 and/or IPv6). RFC 1918 private ranges are standard:

| Range | Size | Notes |
|---|---|---|
| 10.0.0.0/8 | 16.7M IPs | Too broad for a single VPC; pick a /16 or /20 slice |
| 172.16.0.0/12 | 1M IPs | Common for secondary CIDRs |
| 192.168.0.0/16 | 65,536 IPs | Common in small environments |

VPC sizing rules of thumb:

  • Use /16 for the VPC (65,536 IPs) — gives room for many subnets across many AZs
  • Use /24 for subnets (256 IPs; 251 usable — AWS reserves 5 per subnet)
  • Reserve the bottom of your range: AWS reserves .0 (network), .1 (VPC router), .2 (DNS), .3 (future use), and the last address (broadcast)
  • Avoid CIDR blocks that overlap with corporate networks or other VPCs you will peer with — overlapping CIDRs prevent VPC peering

10.0.1.0/24 — 256 IPs, 251 usable
Reserved: .0 (net), .1 (router), .2 (DNS), .3 (AWS future), .255 (broadcast)

A secondary CIDR can be added to an existing VPC (e.g., add 100.64.0.0/16 for additional capacity without renumbering).
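The "251 usable" figure generalizes to any prefix length. A one-liner for the arithmetic (total addresses minus the 5 AWS reserves; /28 is the smallest subnet AWS permits):

```shell
# Usable IPs in an AWS subnet: total addresses minus the 5 AWS reserves.
# /28 is the smallest subnet size AWS allows.
usable_ips() {
  echo $(( (1 << (32 - $1)) - 5 ))
}

usable_ips 24   # 251 (256 - 5)
usable_ips 28   # 11  (16 - 5)
usable_ips 16   # 65531
```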

Public vs. Private Subnets

The distinction is purely about route table entries, not subnet attributes per se:

| Characteristic | Public Subnet | Private Subnet |
|---|---|---|
| Route to internet | 0.0.0.0/0 → IGW | None (or 0.0.0.0/0 → NAT GW) |
| Resources reachable from internet | Yes (if Elastic IP / public IP assigned) | No |
| Resources can reach internet | Yes (directly) | Yes (via NAT Gateway) |
| Typical tenants | Load balancers, bastion hosts, NAT Gateways | App servers, containers, databases |

Auto-assign public IPv4: a subnet-level setting. When enabled, new EC2 instances get a public IP automatically. Leave this OFF for private subnets.

Route Tables

Every subnet is associated with exactly one route table. The main route table is assigned by default; you can create custom route tables and associate subnets explicitly.

Route evaluation order: longest prefix match wins.

Destination                Target         Notes
────────────────────────   ────────────   ──────────────────────────────────────
10.0.0.0/16                local          Always present; VPC-internal traffic
0.0.0.0/0                  igw-0abc123    Internet Gateway (public subnet)
pl-xxxx (S3 prefix list)   vpce-xxx       Gateway endpoint for S3 (free)

For a private subnet:

Destination                Target
────────────────────────   ────────────
10.0.0.0/16                local
0.0.0.0/0                  nat-0abc123    NAT Gateway in public subnet of same AZ

Internet Gateway (IGW)

  • One per VPC — you cannot attach multiple IGWs
  • Horizontally scaled, redundant, no bandwidth bottleneck — AWS manages it
  • Performs NAT for instances with public/Elastic IPs (1:1 static NAT, not port-multiplexing)
  • Attaching an IGW does nothing on its own — you must add a 0.0.0.0/0 → igw route to the subnet's route table

Elastic IPs

  • Static public IPv4 addresses allocated to your account — survive instance stop/start
  • Charged $0.005/hr; since February 2024 AWS bills all public IPv4 addresses at this rate, attached or not — always release unused EIPs
  • An EIP can be remapped to a different instance in seconds (useful for failover without waiting out DNS TTLs)
  • Limit: 5 per region by default (can be raised)

NAT Gateway

  • Managed, highly available within a single AZ — place one NAT Gateway per AZ for full HA
  • Cost: $0.045/hr (≈$32/month) + $0.045/GB data processing — this data cost adds up fast for high-throughput workloads
  • NAT Instance: the old DIY approach on a special AMI. Deprecated for new deployments — no auto-scaling, single point of failure, requires disabling the source/dest check
  • NAT Gateway does NOT support inbound connections from the internet — it is outbound-only

```bash
# Create NAT Gateway (requires Elastic IP)
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway \
  --subnet-id subnet-public-1a \
  --allocation-id eipalloc-0abc123 \
  --tag-specifications 'ResourceType=natgateway,Tags=[{Key=Name,Value=nat-1a}]'
```

AZ Design: The 6-Subnet Pattern

For a typical 3-tier application across 2 AZs (minimum production setup):

Region us-east-1
└── VPC 10.0.0.0/16
    ├── public-1a       10.0.1.0/24   ──┐
    ├── public-1b       10.0.2.0/24   ──┼── IGW ── Internet
    │                                    │
    ├── private-app-1a  10.0.11.0/24 ────── NAT-1a (in public-1a)
    ├── private-app-1b  10.0.12.0/24 ────── NAT-1b (in public-1b)
    │
    ├── private-db-1a   10.0.21.0/24  (no internet route)
    └── private-db-1b   10.0.22.0/24  (no internet route)

Extend to 3 AZs for production (1a, 1b, 1c) = 9 subnets. Database subnets often have no internet route at all — not even via NAT — as a defense-in-depth measure.
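A numbering convention like this is worth scripting so new AZs or tiers stay consistent. A sketch that derives the plan above (the 1/11/21 third-octet offsets are this article's convention, not an AWS rule):

```shell
# Derive the 6-subnet plan for VPC 10.0.0.0/16 across two AZs.
# Tier offsets (public=1, app=11, db=21) follow the article's convention.
base="10.0"
azs=(1a 1b)
declare -A tier_offset=( [public]=1 [private-app]=11 [private-db]=21 )

plan=$(
  for tier in public private-app private-db; do
    i=0
    for az in "${azs[@]}"; do
      off=${tier_offset[$tier]}
      printf '%-16s %s.%d.0/24\n' "${tier}-${az}" "$base" $(( off + i ))
      i=$(( i + 1 ))
    done
  done
)
echo "$plan"   # public-1a 10.0.1.0/24 ... private-db-1b 10.0.22.0/24
```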


2. Security Groups vs. NACLs

| Feature | Security Group | Network ACL (NACL) |
|---|---|---|
| Applies to | Individual ENI (instance/container/LB) | Entire subnet |
| Stateful? | Yes — return traffic auto-allowed | No — must allow both directions |
| Rule types | Allow only | Allow and Deny |
| Evaluation | All rules evaluated; most permissive wins | Rules evaluated in order (lowest number first); first match wins |
| Default (new custom) | Deny all inbound; allow all outbound | Deny all (only the implicit * deny rule until you add rules) |
| Default (VPC main NACL) | N/A | Allow all (rule 100 allow, rule * deny) |
| Rule limit | 60 inbound + 60 outbound (default) | 20 inbound + 20 outbound (default) |
| IP or SG reference | Both IP CIDR and other SG IDs | IP CIDR only |
| Use case | Fine-grained instance-level control | Subnet-wide allow/deny; emergency IP blocking |

Stateless NACLs: Ephemeral Ports

Because NACLs are stateless, return traffic for a client TCP connection uses ephemeral (client) ports 1024–65535. If your NACL outbound rules only allow port 443, inbound HTTPS responses from the server will be blocked unless you also allow inbound 1024–65535.

Client (ephemeral :54321) ──→ Server :443   # inbound NACL: allow dst 443
Server :443 ──→ Client (ephemeral :54321)   # outbound NACL: must allow src 443, dst 1024–65535
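First-match-wins ordering plus the implicit final deny can be modeled in a few lines. A toy NACL evaluator (the rule format and rule numbers are invented for illustration; it matches on destination port only):

```shell
# Toy NACL evaluator: rules are "number action protocol port_from port_to",
# evaluated lowest number first; first match wins; the final '*' deny is
# implicit. Matches on destination port only, for illustration.
nacl_out="100 allow tcp 443 443
200 allow tcp 1024 65535"

nacl_eval() {   # nacl_eval "<rules>" <dst_port>  ->  allow | deny
  local rules=$1 port=$2 num action proto from to
  while read -r num action proto from to; do
    if (( port >= from && port <= to )); then
      echo "$action"; return
    fi
  done <<< "$(sort -n <<< "$rules")"
  echo deny   # nothing matched: the implicit '*' deny rule
}

nacl_eval "$nacl_out" 443     # allow (rule 100)
nacl_eval "$nacl_out" 54321   # allow (rule 200: ephemeral-port return traffic)
nacl_eval "$nacl_out" 80      # deny  (no rule matches)
```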

3-Tier Security Group Pattern

Internet
    │  HTTPS :443
    ▼
[ALB SG]  — inbound: 443 from 0.0.0.0/0
    │  HTTP :8080
    ▼
[App SG]  — inbound: 8080 from sg-alb-id  (SG reference, not CIDR)
    │  Postgres :5432
    ▼
[DB SG]   — inbound: 5432 from sg-app-id  (SG reference)

Referencing SG IDs (not CIDRs) means the rule automatically applies to all instances in that group — no IP address management needed, and the rule is immune to IP changes.

```bash
# Create app security group rule that allows traffic from the ALB SG
aws ec2 authorize-security-group-ingress \
  --group-id sg-app123 \
  --protocol tcp \
  --port 8080 \
  --source-group sg-alb456  # Reference SG, not CIDR

# Create DB security group rule allowing only the app tier
aws ec2 authorize-security-group-ingress \
  --group-id sg-db789 \
  --protocol tcp \
  --port 5432 \
  --source-group sg-app123
```

When to Use NACLs vs. Security Groups

Use NACLs when you need to:

  • Block a specific IP address or CIDR (security groups can only allow)
  • Apply a blanket rule to all resources in a subnet at once
  • Create an emergency deny (e.g., blocking an attacker IP without modifying every SG)

Use Security Groups for everything else — they are easier to manage, stateful (less footgun potential), and support SG references.


3. VPC Connectivity

VPC Peering

VPC peering creates a direct, private network connection between two VPCs. Traffic stays on the AWS backbone — no IGW, VPN, or separate hardware.

Key properties:

  • Non-transitive: A↔B and B↔C does NOT enable A↔C. You must create a direct A↔C peering.
  • Route tables on both sides must be updated with the peer's CIDR
  • No overlapping CIDRs — peering fails if the VPCs have conflicting IP ranges
  • No bandwidth limit; no single point of failure
  • Cross-account and cross-region supported (inter-region traffic is encrypted by default)

```bash
# Create peering connection
aws ec2 create-vpc-peering-connection \
  --vpc-id vpc-aaaa \
  --peer-vpc-id vpc-bbbb \
  --peer-region us-west-2  # omit for same-region

# Accept from the peer side (or the same account for same-region)
aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0abc123

# Add route on BOTH sides
aws ec2 create-route \
  --route-table-id rtb-aaaa \
  --destination-cidr-block 10.1.0.0/16 \
  --vpc-peering-connection-id pcx-0abc123
```

When to use: Small number of VPCs (< 5–10); direct billing simplicity; when transitive routing is not needed.
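The "small number of VPCs" guidance follows from how fast a full mesh grows. The handshake formula in runnable form:

```shell
# Full-mesh peering connections needed for N VPCs: N * (N - 1) / 2
mesh() { echo $(( $1 * ($1 - 1) / 2 )); }

mesh 3    # 3, manageable
mesh 10   # 45, already painful
mesh 50   # 1225, use a Transit Gateway instead
```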

Transit Gateway (TGW)

TGW is a regional network hub. Attach many VPCs, VPN connections, and Direct Connect gateways to a single TGW and get transitive routing — A can reach C through the hub.

VPC-Prod-1 ──┐
VPC-Prod-2 ──┤
VPC-Dev    ──┼── TGW ──── On-Prem (via VPN or DX)
VPC-Shared ──┤
VPC-Mgmt   ──┘

TGW Route Tables for segmentation:

| Route Table | Attached VPCs | Can reach |
|---|---|---|
| rt-prod | Prod-1, Prod-2, Shared | Each other + Shared + On-Prem |
| rt-dev | Dev | Shared only (isolated from Prod) |
| rt-shared | Shared | All (central services) |

# TGW association: which route table a VPC uses for outbound lookups
# TGW propagation: which route tables learn this VPC's CIDR automatically

TGW Attachment (VPC-Dev) → associated with rt-dev
                         → propagates CIDR into rt-dev and rt-shared (not rt-prod)

Cost: $0.05/hr per attachment (VPC, VPN, DX GW) + $0.02/GB data processed. Inter-region peering: TGWs in different regions can be peered — a single control plane for a global network.

When to use: > 5 VPCs; need transitive routing; hybrid connectivity (on-prem + cloud); environment segmentation.

VPC Endpoints & PrivateLink

PrivateLink enables private connectivity to AWS services (and your own services) without traffic leaving the AWS network.

Gateway Endpoints (S3, DynamoDB only)

  • Free — no hourly or data processing charge
  • Implemented as a route table entry (prefix list → endpoint)
  • No security group; controlled via endpoint policy (IAM-style JSON)
  • Traffic stays within the AWS backbone and does NOT leave to the internet via NAT

```bash
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-aaaa \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-private-1a rtb-private-1b \
  --vpc-endpoint-type Gateway
```

Interface Endpoints (all other services)

  • Creates an ENI (Elastic Network Interface) in your subnet with a private IP
  • Costs: $0.01/hr per AZ + $0.01/GB data processed
  • Has a security group — controls which sources can reach the endpoint
  • Enables private DNS: resolves s3.amazonaws.com (or any service endpoint) to the private IP automatically (requires enableDnsHostnames and enableDnsSupport on the VPC)

```bash
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-aaaa \
  --service-name com.amazonaws.us-east-1.ssm \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-private-1a subnet-private-1b \
  --security-group-ids sg-endpoint123 \
  --private-dns-enabled
```

PrivateLink for Custom Services

Expose your NLB as a VPC endpoint service — other accounts create interface endpoints to reach your service without VPC peering or the public internet:

Consumer VPC ──[Interface Endpoint]──► AWS PrivateLink ──► Provider VPC NLB ──► Your app

No peering required, no overlapping CIDR concerns, consumer-controlled, cross-account. Ideal for ISV / platform team patterns.

See IAM & Security for writing endpoint policies that restrict which IAM principals and S3 bucket prefixes are accessible via the endpoint.

Site-to-Site VPN

Connects your on-premises network to a VPC over the public internet, encrypted with IPsec.

Components:

  • Virtual Private Gateway (VGW): AWS-side endpoint; attached to your VPC
  • Customer Gateway (CGW): represents your on-prem VPN device in AWS (just stores IP + routing config)
  • VPN Connection: two IPsec tunnels per connection for redundancy (each limited to 1.25 Gbps)

On-Prem Router ──[Tunnel 1]──► VGW (us-east-1a endpoint)
               ──[Tunnel 2]──► VGW (us-east-1b endpoint)

Routing options:

  • Static: manually enter on-prem CIDRs in the VGW config
  • Dynamic (BGP): preferred — route propagation to route tables, automatic failover

```bash
# Enable route propagation from VGW to route table
aws ec2 enable-vgw-route-propagation \
  --route-table-id rtb-private \
  --gateway-id vgw-0abc123
```

VPN + TGW: Attach VPN to TGW instead of VGW to share on-prem connectivity across many VPCs.

AWS Direct Connect (DX)

Dedicated physical connection from your data center to an AWS DX location — bypasses the public internet entirely.

| Type | Bandwidth | Provisioning | Use Case |
|---|---|---|---|
| Dedicated | 1, 10, 100 Gbps | Weeks (own cross-connect at DX location) | Predictable high throughput; enterprise |
| Hosted | 50 Mbps – 10 Gbps | Days (via DX Partner) | Faster provisioning; flexible sizing |

Virtual Interfaces (VIFs):

| VIF Type | Connects To | Use Case |
|---|---|---|
| Private VIF | VGW or TGW | Access resources inside a VPC |
| Transit VIF | TGW | Access many VPCs via Transit Gateway |
| Public VIF | AWS public endpoints | Access S3, DynamoDB without traversing the internet |

High Availability with DX + VPN Failover:

Primary:   On-Prem ──[DX 10 Gbps]──► TGW   (BGP MED = 100, preferred)
Failover:  On-Prem ──[VPN IPsec]───► TGW   (BGP MED = 200 or AS-path prepend)

BGP prefers the lower MED — DX with MED 100 wins; VPN takes over if DX fails.

Link Aggregation Group (LAG): Bundle 2–4 DX connections for higher aggregate bandwidth and active/active redundancy.


4. DNS in VPC

Route 53 Resolver (VPC+2 Resolver)

Every VPC gets a DNS resolver at the VPC base IP + 2 (e.g., 10.0.0.2 for a 10.0.0.0/16 VPC). This resolver handles all DNS queries from instances in the VPC.

Two VPC settings that must BOTH be enabled for EC2 instances to get public DNS names:

| Setting | What it does |
|---|---|
| enableDnsSupport | Enables the VPC+2 DNS resolver |
| enableDnsHostnames | Assigns public DNS hostnames to instances with public IPs |

If either is disabled, ec2-xx-xx-xx-xx.compute-1.amazonaws.com won't resolve, and interface endpoint private DNS won't work.

```bash
aws ec2 modify-vpc-attribute --vpc-id vpc-aaaa --enable-dns-support '{"Value":true}'
aws ec2 modify-vpc-attribute --vpc-id vpc-aaaa --enable-dns-hostnames '{"Value":true}'
```

Private Hosted Zones

Associate a Route 53 private hosted zone with one or more VPCs — instances resolve db.internal.mycompany.com to a private IP without leaving the VPC:

```bash
aws route53 create-hosted-zone \
  --name internal.mycompany.com \
  --caller-reference $(date +%s) \
  --hosted-zone-config PrivateZone=true \
  --vpc VPCRegion=us-east-1,VPCId=vpc-aaaa
```

Hybrid DNS: Resolver Endpoints

For hybrid environments where on-prem systems need to resolve AWS private DNS, or AWS needs to forward to on-prem DNS:

| Endpoint Type | Direction | Use Case |
|---|---|---|
| Inbound Resolver Endpoint | On-prem → AWS | On-prem DNS servers forward *.internal.aws queries to ENIs in the VPC |
| Outbound Resolver Endpoint | AWS → On-prem | VPC instances query *.corp.local, which forwards to on-prem DNS |

Resolver Rules (attached to Outbound endpoints):

  • Forward rule: specific domain → target on-prem DNS IPs (e.g., corp.local → 192.168.1.53)
  • System rule: AWS-managed auto-defined domains (.amazonaws.com, private hosted zones)
  • Recursive rule: default for everything else → Route 53 Resolver

VPC instance: nslookup db.corp.local
  → VPC+2 resolver
  → Forward rule: corp.local → outbound endpoint
  → Outbound endpoint ENI → On-prem DNS 192.168.1.53
  → Returns 192.168.50.100 (on-prem IP)

Resolver rules can be shared via RAM (Resource Access Manager) across accounts in an organization.


5. Load Balancers

Full Comparison

| Feature | ALB (Application) | NLB (Network) | GWLB (Gateway) |
|---|---|---|---|
| OSI layer | 7 (HTTP/HTTPS) | 4 (TCP/UDP/TLS) | 3+4 (IP/TCP/UDP) |
| Protocols | HTTP, HTTPS, WebSocket, gRPC | TCP, UDP, TLS | Any IP traffic |
| Use case | Web apps, microservices, API routing | Extreme performance, TCP/UDP, PrivateLink | Security appliances (IDS/IPS, firewalls) |
| Static IP per AZ | No (use Global Accelerator) | Yes (or Elastic IP) | Yes |
| SSL/TLS termination | Yes | Yes (or passthrough) | No (transparent) |
| Target types | Instance, IP, Lambda | Instance, IP, ALB | Instance, IP |
| Sticky sessions | Yes (duration-based or app cookie) | Yes (source IP) | N/A |
| WAF integration | Yes | No | No |
| Request/response modification | Yes (redirect, fixed-response, headers) | No | No |
| Cross-zone load balancing | On by default | Off by default (charge applies if on) | On by default |
| Preserve source IP | Via X-Forwarded-For header | Yes (native) | Yes |
| PrivateLink provider | No | Yes | No |
| Latency | ~1 ms | Sub-millisecond | N/A (bump-in-the-wire) |

ALB: Advanced Routing

ALB listener rules are evaluated top-to-bottom; first match wins. Supports path, host, HTTP header, query string, and source IP conditions:

```yaml
# CloudFormation: ALB Listener Rules
ListenerRuleAPI:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref ALBListener
    Priority: 10
    Conditions:
      - Field: path-pattern
        Values: ["/api/*"]
    Actions:
      - Type: forward
        TargetGroupArn: !Ref APITargetGroup

ListenerRuleStatic:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref ALBListener
    Priority: 20
    Conditions:
      - Field: host-header
        Values: ["static.myapp.com"]
    Actions:
      - Type: redirect
        RedirectConfig:
          Host: "cdn.myapp.com"
          StatusCode: HTTP_301

ListenerRuleDefault:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !Ref ALBListener
    Priority: 100
    Conditions:
      - Field: path-pattern
        Values: ["/*"]
    Actions:
      - Type: forward
        TargetGroupArn: !Ref WebTargetGroup
```

See Security Services for attaching AWS WAF to ALB to block malicious traffic at the load balancer layer before it reaches application servers.

NLB: Key Characteristics

  • Static IP per AZ — critical for whitelisting by IP (e.g., partner firewall rules)
  • TLS passthrough: NLB forwards encrypted traffic without terminating; the EC2 instance handles TLS (certificate lives on the server, not the LB)
  • Historically no security group on the NLB itself (NLBs created since August 2023 can have one); otherwise traffic is controlled by target security groups and subnet NACLs
  • PrivateLink provider: expose your NLB as a private endpoint service to other VPCs/accounts

GWLB: Bump-in-the-Wire

GWLB inserts security appliances transparently into traffic flows using Geneve encapsulation (port 6081):

Client
  │
  ▼
GWLB (receives packet)
  │  Geneve encapsulation
  ▼
Security Appliance (IDS/IPS/firewall — inspects and optionally drops)
  │  Returns packet to GWLB
  ▼
GWLB (strips encapsulation, forwards original packet)
  │
  ▼
Destination EC2

Route traffic through GWLB via route table entries — 0.0.0.0/0 → GWLB endpoint. See Compute & Containers for EKS/ECS networking considerations when deploying to VPCs with GWLB in the path.


6. VPC Flow Logs & Network Observability

Flow Logs

Flow logs capture metadata about IP traffic going to and from network interfaces. They do NOT capture the payload — just the 5-tuple plus metadata.

Default fields captured:

version account-id interface-id srcaddr dstaddr srcport dstport protocol
packets bytes start end action log-status

Version 5 custom format adds critical context:

${vpc-id} ${subnet-id} ${instance-id} ${tcp-flags} ${pkt-srcaddr} ${pkt-dstaddr}
${flow-direction} ${traffic-path}

pkt-srcaddr/pkt-dstaddr show the original source/destination before NAT translation (useful for tracing through NAT Gateways).
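Before reaching for Athena, records can be sanity-checked locally, since the default format is whitespace-delimited. An awk tally of rejected flows per source address (the two sample records below are fabricated):

```shell
# Two fabricated records in the default flow log format (fields: version,
# account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol,
# packets, bytes, start, end, action, log-status).
log='2 123456789012 eni-0a1 10.0.1.5 10.0.21.7 44321 5432 6 10 840 1620000000 1620000060 ACCEPT OK
2 123456789012 eni-0a1 203.0.113.9 10.0.1.5 55000 22 6 1 40 1620000000 1620000060 REJECT OK'

# Count REJECTed flows per source address (field 4 = srcaddr, field 13 = action)
rejects=$(echo "$log" | awk '$13 == "REJECT" { n[$4]++ } END { for (ip in n) print ip, n[ip] }')
echo "$rejects"   # 203.0.113.9 1
```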

Destinations:

| Destination | Query Tool | Latency | Cost |
|---|---|---|---|
| CloudWatch Logs | CloudWatch Logs Insights | Near real-time | Higher (CW ingestion + storage) |
| S3 | Athena + Glue | 5–15 min delay | Lower (S3 storage only) |
| Kinesis Data Firehose | Real-time processing (Splunk, OpenSearch) | Near real-time | Firehose + destination |

```bash
# Enable flow logs to S3
aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-aaaa \
  --traffic-type ALL \
  --log-destination-type s3 \
  --log-destination arn:aws:s3:::my-flow-logs-bucket/vpc-logs/ \
  --log-format '${version} ${vpc-id} ${subnet-id} ${instance-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${tcp-flags}'
```

Athena Query: Top Talkers and Blocked Connections

```sql
-- Top 10 source IPs by bytes transferred
SELECT srcaddr, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
WHERE action = 'ACCEPT'
  AND start >= to_unixtime(current_timestamp - interval '1' hour)
GROUP BY srcaddr
ORDER BY total_bytes DESC
LIMIT 10;

-- Find REJECTED connections (potential attack or misconfiguration)
SELECT srcaddr, dstaddr, dstport, protocol, COUNT(*) AS reject_count
FROM vpc_flow_logs
WHERE action = 'REJECT'
  AND start >= to_unixtime(current_timestamp - interval '1' hour)
GROUP BY srcaddr, dstaddr, dstport, protocol
ORDER BY reject_count DESC
LIMIT 20;

-- Cross-AZ traffic (identify cost sources)
SELECT srcaddr, dstaddr, SUM(bytes) AS bytes
FROM vpc_flow_logs
WHERE subnet_id != dst_subnet_id  -- simplified; filter by known subnet CIDRs
GROUP BY srcaddr, dstaddr
ORDER BY bytes DESC;
```

Reachability Analyzer

Automated path analysis — you specify a source and destination, and AWS traces the logical path and reports exactly which component (SG rule, NACL, route table, peering) blocks connectivity:

```bash
aws ec2 create-network-insights-path \
  --source eni-source123 \
  --destination eni-dest456 \
  --protocol tcp \
  --destination-port 443

aws ec2 start-network-insights-analysis \
  --network-insights-path-id nip-0abc123
```

Returns: REACHABLE or a precise explanation of the blocking component. Invaluable for debugging "why can't my Lambda reach RDS?" without manually auditing every SG and route table.

Network Access Analyzer

Proactively audits your entire VPC topology to find unintended access paths — e.g., any path from the internet to your RDS instances that you didn't explicitly intend. Runs without needing a specific source/destination pair.

Traffic Mirroring

Copy raw packet traffic from ENIs to an inspection target (an NLB fronting an appliance fleet, or a packet-capture EC2 instance). Use it for:

  • Deep packet inspection (DPI)
  • IDS/IPS running on EC2
  • Forensic analysis of suspected compromised instances

Only available on Nitro-based instances.


7. High Availability Patterns

NAT Gateway HA: Per-AZ Design

Anti-pattern (single NAT Gateway):

AZ-1: private-app-1a → NAT-1a (public-1a) → Internet   ✓
AZ-2: private-app-1b → NAT-1a (public-1a) → Internet   ✗ cross-AZ traffic
                                                           + single point of failure

Correct pattern (NAT Gateway per AZ):

AZ-1: private-app-1a → NAT-1a (public-1a) → Internet   ✓ same-AZ, HA
AZ-2: private-app-1b → NAT-1b (public-1b) → Internet   ✓ same-AZ, HA

Route tables must be AZ-specific — rtb-private-1a routes 0.0.0.0/0 to nat-1a, and rtb-private-1b routes to nat-1b. If AZ-1 fails, AZ-2 traffic is unaffected.
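Because the table-to-NAT pairing is mechanical, it is worth generating rather than hand-typing. A sketch that prints one create-route call per AZ (all IDs are placeholders; the commands are printed, not executed):

```shell
# Pair each private route table with the NAT Gateway in the SAME AZ and
# emit the corresponding create-route call (all IDs are placeholders).
declare -A nat_for_az=( [1a]=nat-0aaa [1b]=nat-0bbb )

cmds=$(
  for az in 1a 1b; do
    echo "aws ec2 create-route" \
         "--route-table-id rtb-private-${az}" \
         "--destination-cidr-block 0.0.0.0/0" \
         "--nat-gateway-id ${nat_for_az[$az]}"
  done
)
echo "$cmds"
```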

Cost vs. HA trade-off: Two NAT Gateways = ~$64/month vs. ~$32/month. For production, always pay for per-AZ NAT Gateways.

Multi-AZ Subnet Design Matrix

| Resource | AZ-1 Subnet | AZ-2 Subnet | Route Table | Notes |
|---|---|---|---|---|
| ALB | public-1a | public-1b | main (→ IGW) | Must span 2+ AZs |
| NAT Gateway | public-1a | public-1b | N/A | One per AZ |
| ECS/EKS tasks | private-app-1a | private-app-1b | private (→ NAT, same AZ) | Register in both AZs |
| RDS Primary | private-db-1a | — | private (no internet) | — |
| RDS Standby | — | private-db-1b | private (no internet) | Multi-AZ failover |
| ElastiCache | private-db-1a | private-db-1b | private (no internet) | Cluster mode spans AZs |

TGW HA

TGW attachments are multi-AZ by default — specify subnets in each AZ when attaching a VPC. TGW itself is a regional service with AWS-managed HA. Use separate route tables per environment:

TGW Attachment for VPC-Prod → subnets: private-1a, private-1b (both AZs)
TGW Route Table rt-prod:
  10.0.0.0/16  → attachment-prod-1
  10.1.0.0/16  → attachment-prod-2
  10.99.0.0/16 → attachment-shared
  0.0.0.0/0    → attachment-firewall (centralized egress inspection)

Direct Connect HA

For mission-critical workloads:

Tier 1 (highest priority): 2× Dedicated DX from 2 different DX locations (different providers)
Tier 2 (failover):         VPN over the internet as a final fallback

BGP routing: DX routes have a lower MED → preferred. VPN routes have a higher MED → used only when DX is down. See CloudFront & Route 53 for global traffic management layered on top of hybrid connectivity.


8. Cost Optimization

NAT Gateway: The Hidden Cost

NAT Gateway data processing at $0.045/GB is the most common VPC budget surprise. Strategies to reduce it:

1. Gateway Endpoints for S3 and DynamoDB (free)

Before: EC2 → NAT Gateway ($0.045/GB) → S3
After:  EC2 → S3 Gateway Endpoint (free) → S3

If your workloads read/write heavily from S3 (data pipelines, ML training, log archival), this single change can save hundreds per month.

2. Interface Endpoints for other AWS services

Break-even calculation:

NAT Gateway data cost:      N GB/month × $0.045
Interface Endpoint cost:    $0.01/hr × 2 AZs × 730 hrs + N GB × $0.01
                          = $14.60/month base + N × $0.01

Break-even: N × $0.045 = $14.60 + N × $0.01
            N × $0.035 = $14.60
            N ≈ 417 GB/month

For > 417 GB/month through NAT to a single service, Interface Endpoint saves money.
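The same break-even arithmetic in runnable form (using the rates quoted above; awk handles the floating-point division):

```shell
# Interface Endpoint vs. NAT Gateway break-even, using the rates above:
# NAT data processing at $0.045/GB vs. $0.01/hr x 2 AZs x 730 hrs + $0.01/GB.
breakeven_gb=$(awk 'BEGIN {
  base = 0.01 * 2 * 730                  # endpoint base: $14.60/month
  printf "%.0f", base / (0.045 - 0.01)   # GB/month where the costs are equal
}')
echo "$breakeven_gb"   # 417
```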

3. Identify cross-AZ traffic with Flow Logs

Cross-AZ data transfer costs $0.01/GB each direction. Use Athena queries on flow logs to identify top cross-AZ talkers. Common culprit: an EC2 instance in AZ-1 connecting to an ElastiCache node or RDS read replica in AZ-2.

Fixes:

  • Use ElastiCache cluster mode with replicas in each AZ; configure client to prefer local AZ node
  • Use RDS read replicas per AZ; route read queries to same-AZ replica
  • For ECS/EKS, use topologySpreadConstraints or AZ affinity to keep caller and callee in same AZ

4. VPN vs. Direct Connect

| Scenario | Recommendation |
|---|---|
| < 500 GB/month data transfer, variable | VPN ($0.05/hr/tunnel + data) |
| > 1 TB/month data transfer, consistent | DX (better data pricing, higher base cost) |
| Low latency required (< 10 ms) | DX (dedicated, no internet congestion) |
| Fast provisioning needed | VPN (minutes) vs. DX (weeks) |

See Cost Optimization for a full AWS cost optimization framework, reserved capacity strategies, and cost anomaly detection patterns.


Interview Q&A


Q: What's the difference between a Security Group and a NACL?

A: Security Groups are stateful, instance-level firewalls that only support allow rules. Return traffic is automatically permitted regardless of outbound rules. NACLs are stateless, subnet-level ACLs that support both allow and deny rules; you must explicitly allow both inbound and outbound traffic (including ephemeral ports 1024–65535 for return traffic). In practice, use Security Groups for all normal access control and NACLs for emergency IP blocking or an extra layer of subnet-wide policy. NACLs evaluate rules in order — the first match wins — which is a common gotcha when troubleshooting.


Q: Explain VPC peering vs. Transit Gateway โ€” when do you use each?

A: VPC peering is a direct, point-to-point private connection between two VPCs. It has no bandwidth limit and no per-hour charge but is non-transitive (you need N×(N-1)/2 peering connections for a full mesh). Transit Gateway is a regional hub that enables transitive routing — any attached VPC or VPN can reach any other attached network through the hub. Use peering for 2–5 VPCs with simple topology; use TGW when you have many VPCs, need on-prem connectivity shared across them, or need network segmentation via separate TGW route tables (e.g., prod vs. dev isolation).


Q: How does DNS resolution work in a VPC?

A: Every VPC has a built-in DNS resolver at VPC base + 2 (e.g., 10.0.0.2). Instances send all DNS queries there. For public DNS, it forwards to Route 53 public resolvers. For private hosted zones associated with the VPC, it returns the private IP. Two VPC attributes must both be true: enableDnsSupport (enables the resolver) and enableDnsHostnames (assigns EC2 public DNS names). For hybrid scenarios, you deploy Route 53 Resolver inbound endpoints (so on-prem can query AWS DNS) and outbound endpoints with forwarding rules (so VPC instances forward corp.local queries to on-prem DNS servers).


Q: A Lambda function in a VPC can't reach the internet. What are the likely causes and fixes?

A: The most common causes in order of likelihood:

  1. No NAT Gateway — Lambda in a VPC gets a private IP from the subnet, not a public IP. It needs a NAT Gateway in a public subnet, with a route 0.0.0.0/0 → NAT GW in the private subnet's route table.
  2. Lambda is in a public subnet — counterintuitive, but even in a public subnet Lambda ENIs don't get public IPs. Must use a private subnet + NAT.
  3. Security Group blocks outbound — check that the Lambda's SG allows outbound 443/80 to 0.0.0.0/0.
  4. NACL blocks outbound or inbound return — verify the NACL allows outbound 443 and inbound 1024–65535 (ephemeral ports).
  5. Missing VPC endpoint — if the Lambda only needs to reach AWS services (S3, Secrets Manager, SSM), use VPC endpoints instead; cheaper and more reliable than routing through NAT.

See Lambda for the full Lambda VPC cold start and ENI attachment mechanics.


Q: How do you design VPC networking for a 3-tier web application across 2 AZs?

A: Use a /16 VPC with 6 subnets across 2 AZs: public-1a/1b (10.0.1.0/24, 10.0.2.0/24), private-app-1a/1b (10.0.11.0/24, 10.0.12.0/24), private-db-1a/1b (10.0.21.0/24, 10.0.22.0/24). Public subnets route 0.0.0.0/0 to an IGW and host the ALB and one NAT Gateway per AZ. Private app subnets route to their AZ's NAT Gateway and host ECS/EKS tasks. Private DB subnets have no internet route (not even NAT) and host RDS Multi-AZ. Three security groups: ALB SG (allows 443 from 0.0.0.0/0), app SG (allows 8080 from ALB SG), DB SG (allows 5432 from app SG). Gateway endpoints for S3 and DynamoDB eliminate NAT costs for those services.
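This subnet plan can be sanity-checked programmatically; a sketch with the stdlib `ipaddress` module, using the exact CIDRs from the answer above:

```python
# Validate the 6-subnet plan: every /24 must sit inside the /16 VPC CIDR
# and no two subnets may overlap.
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
plan = {
    "public-1a": "10.0.1.0/24",  "public-1b": "10.0.2.0/24",
    "app-1a":    "10.0.11.0/24", "app-1b":    "10.0.12.0/24",
    "db-1a":     "10.0.21.0/24", "db-1b":     "10.0.22.0/24",
}
subnets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}

assert all(s.subnet_of(vpc) for s in subnets.values())
nets = list(subnets.values())
assert not any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])

# AWS reserves 5 addresses per subnet, so each /24 has 251 usable hosts.
print("usable hosts per /24:", 256 - 5)
```

The gaps in the numbering (1/2, 11/12, 21/22) are deliberate: they leave room to add a third AZ or a new tier later without renumbering.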


Q: What's AWS PrivateLink and when would you use it over VPC peering?

A: PrivateLink exposes a service (backed by an NLB) as a private endpoint that consumers connect to via an Interface Endpoint in their own VPC. Traffic never crosses the public internet, and the consumer doesn't need peering or overlapping-CIDR awareness. Use PrivateLink when: (1) you want to expose a service to many consumers without granting full VPC access that peering implies; (2) the service provider and consumer have overlapping CIDRs (peering can't be used); (3) you're building a SaaS platform and want tenant isolation; (4) you need to cross organizational boundaries cleanly. PrivateLink is one-directional (consumer can reach the service, not vice versa). Peering is simpler for full bidirectional VPC-to-VPC connectivity within a small, trusted network.


Q: How would you reduce data transfer costs in a multi-AZ architecture?

A: Three levers: First, add Gateway endpoints for S3 and DynamoDB โ€” free, removes NAT Gateway data processing costs for those services (often the biggest win). Second, analyze cross-AZ traffic with VPC Flow Logs + Athena โ€” find top cross-AZ talkers and co-locate them or add per-AZ replicas for caches/databases. Third, audit Interface Endpoint break-even: if a service consumes more than ~417 GB/month through NAT, an Interface Endpoint is cheaper despite its $14.60/month base cost. Also review Data Transfer pricing: same-AZ traffic is free; cross-AZ is $0.01/GB each way; internet egress is $0.09/GB. See Cost Optimization for comprehensive cost management patterns.
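The ~417 GB/month break-even figure falls out of the prices quoted above; a quick sketch (us-east-1 prices as of writing, verify current pricing before relying on this):

```python
# Break-even between NAT Gateway data processing and an Interface Endpoint
# spanning 2 AZs. The endpoint has a fixed hourly base cost but a much
# lower per-GB rate, so it wins above a certain monthly volume.
NAT_PER_GB = 0.045   # NAT Gateway data processing, $/GB
IE_PER_GB = 0.01     # Interface Endpoint data processing, $/GB
IE_HOURLY = 0.01     # Interface Endpoint, $ per AZ-hour
HOURS = 730          # ~1 month
AZS = 2

ie_base = IE_HOURLY * HOURS * AZS                    # $14.60/month fixed
break_even_gb = ie_base / (NAT_PER_GB - IE_PER_GB)   # volume where costs equalize
print(f"break-even: {break_even_gb:.0f} GB/month")   # ~417
```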


Q: Design a hybrid network with HA connecting an on-premises data center to AWS.

A: Production HA hybrid design:

On-prem DC (primary location)
  โ”œโ”€โ”€ DX 10 Gbps (DX Location A) โ”€โ”€[Private VIF]โ”€โ”€โ–บ TGW (us-east-1)
  โ””โ”€โ”€ VPN IPsec (backup)         โ”€โ”€[VPN attach]โ”€โ”€โ–บ TGW

On-prem DC (DR location, optional)
  โ””โ”€โ”€ DX 10 Gbps (DX Location B) โ”€โ”€[Private VIF]โ”€โ”€โ–บ TGW (us-east-1)

BGP: DX routes advertised with lower MED (100), VPN with higher MED (200). TGW propagates routes to all attached VPCs. For multi-region: inter-region TGW peering with route tables that permit cross-region traffic selectively. DX uses a LAG (Link Aggregation Group) at each location for additional redundancy. On-prem DNS: Route 53 Outbound Resolver Endpoints forward corp.local to on-prem; Inbound Endpoints let on-prem resolve aws.internal. See CloudFront & Route 53 for global traffic distribution layered on top of this hybrid topology.


Q: How does a Gateway Endpoint differ from an Interface Endpoint?

A: Gateway endpoints (S3 and DynamoDB only) are implemented as route table entries โ€” a prefix list pointing to the endpoint, not an ENI. They are completely free (no hourly charge, no data charge) and have no security group. Interface endpoints (all other services, including custom PrivateLink services) create an ENI in your subnet with a private IP. They cost $0.01/hr per AZ plus $0.01/GB. They have their own security group and support private DNS (the service's public DNS name resolves to the private ENI IP inside the VPC). Interface endpoints work across VPN and Direct Connect; Gateway endpoints do not (on-prem hosts must use Interface endpoints to reach S3 privately).


Q: What happens to traffic routing if a NAT Gateway in one AZ fails?

A: If you have one shared NAT Gateway: all private subnets route 0.0.0.0/0 to it. If it fails, all internet-bound traffic from all private subnets drops. Recovery requires creating a new NAT Gateway and updating route tables โ€” not automatic.

If you have per-AZ NAT Gateways (correct pattern): each AZ's private subnets route to their local NAT Gateway. If AZ-1's NAT fails, only AZ-1 private subnets lose internet egress. AZ-2 is unaffected. You can manually update AZ-1's route table to point to AZ-2's NAT Gateway as a workaround (incurring cross-AZ transfer costs), but AWS recommends accepting the AZ-level failure rather than creating a cross-AZ NAT dependency. The pragmatic production response is to ride out the AZ incident (NAT Gateways rarely fail in isolation; usually the AZ itself is degraded) or fail over at the application layer.


Q: Explain how Transit Gateway route tables enable network segmentation.

A: Each TGW attachment (VPC, VPN, DX) is associated with exactly one TGW route table (which it uses for outbound route lookups) and can propagate its CIDR into multiple route tables. This creates fine-grained segmentation:

TGW rt-prod:   propagates from VPC-Prod-1, VPC-Prod-2, VPC-Shared, On-Prem-VPN
               โ†’ prod VPCs + shared services + on-prem can talk to each other

TGW rt-dev:    propagates from VPC-Dev, VPC-Shared
               โ†’ dev can reach shared services but NOT prod VPCs
               โ†’ on-prem can optionally be excluded

TGW rt-shared: propagates from all VPCs
               โ†’ shared services (DNS, monitoring, artifact repos) are reachable from everywhere

The key mechanic: propagation controls which CIDRs appear in a route table; association controls which route table an attachment consults for outbound traffic. A VPC associated with rt-dev only sees routes in rt-dev โ€” it has no visibility into rt-prod's routes even though those VPCs are attached to the same TGW. See AWS Architecture for multi-account, multi-VPC reference architectures using TGW as the network backbone.
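The association/propagation mechanic can be modeled in a few lines. A toy sketch under the prod/dev layout above (attachment names and CIDRs are illustrative, and real traffic also needs the reverse direction to hold for the return path):

```python
# Toy model of TGW segmentation: each attachment is ASSOCIATED with exactly
# one route table (used for its outbound lookups) and PROPAGATES its CIDR
# into one or more tables. An attachment can only reach destinations whose
# CIDRs were propagated into the table it is associated with.
propagations = {
    "rt-prod": {"vpc-prod-1", "vpc-prod-2", "vpc-shared", "vpn-onprem"},
    "rt-dev":  {"vpc-dev", "vpc-shared"},
}
association = {
    "vpc-prod-1": "rt-prod", "vpc-prod-2": "rt-prod",
    "vpc-shared": "rt-prod", "vpn-onprem": "rt-prod",
    "vpc-dev":    "rt-dev",
}

def can_reach(src: str, dst: str) -> bool:
    """True iff dst propagated its routes into src's associated table."""
    return dst != src and dst in propagations[association[src]]

print(can_reach("vpc-dev", "vpc-shared"))   # True: shared propagates into rt-dev
print(can_reach("vpc-dev", "vpc-prod-1"))   # False: prod never propagates into rt-dev
```

The dev VPC literally has no route to prod CIDRs in the only table it consults, which is why this segmentation holds even though everything hangs off one TGW.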


Red Flags to Avoid

  • Single NAT Gateway for all AZs โ€” saves ~$32/month, costs you an outage when the AZ or NAT has an issue
  • Putting Lambda in a public subnet expecting internet access โ€” Lambda ENIs never get public IPs; always use private subnet + NAT
  • Forgetting NACL ephemeral ports โ€” if you tighten NACLs, you must allow inbound 1024โ€“65535 for return traffic or all TCP connections silently fail
  • Overlapping CIDRs when planning peering or TGW โ€” impossible to fix later without re-IP-ing; plan your CIDR allocation before provisioning
  • 0.0.0.0/0 security group rules on internal services โ€” app and DB tiers should never have 0.0.0.0/0 inbound; always reference the upstream SG ID
  • Not using VPC Gateway Endpoints for S3/DynamoDB โ€” free to add, immediate NAT cost reduction; there is no reason not to have them
  • Enabling enableDnsHostnames without enableDnsSupport โ€” the DNS resolver does not work; VPC endpoint private DNS breaks silently
  • VPC peering without updating both route tables โ€” the connection is established but traffic is one-way; always update both sides
  • Not propagating BGP routes from VGW/TGW โ€” static routes in route tables don't auto-update when on-prem prefixes change; enable route propagation
  • Attaching IGW but not adding it to route table โ€” IGW attachment is a prerequisite, not sufficient; the subnet's route table needs 0.0.0.0/0 โ†’ igw
  • Using NACLs as a primary security control โ€” they are stateless, order-dependent, and lack SG-reference capability; reserve them for emergency blocking
  • Ignoring cross-AZ data transfer costs โ€” for data-heavy workloads, cross-AZ traffic at $0.01/GB each way adds up faster than EC2 or RDS costs
  • Direct Connect without VPN backup โ€” DX is reliable but not infallible; always provision a VPN tunnel as a BGP failover path