AI Infrastructure Lab

A fully functional AI demonstration environment running 8 services across three isolated Docker networks. Every video in the series uses this lab for live demonstrations of security controls, governance policies, and monitoring configurations.

Three-Zone Architecture

Network isolation mirrors the Reference Architecture's three-zone topology — serving, management, and security networks are isolated by design.

Serving Network

Inference, RAG pipeline, API gateway. Handles all external-facing traffic and model serving.

Ollama:11434
ChromaDB:8000
LangChain App:8001
Nginx:8080
Prometheus:9090
OPA:8181

Management Network

Monitoring, model registry, dashboards. Internal operations and observability.

Nginx:8080
MLflow:5000
Prometheus:9090
Grafana:3000

Security Network

Policy engine and scanning. Enforces governance controls across all zones.

OPA:8181
Service Inventory

Service       | Container      | Purpose                                  | Network              | Port
------------- | -------------- | ---------------------------------------- | -------------------- | -----
Ollama        | lab-ollama     | Local LLM inference (Llama 3.2 3B)       | Serving              | 11434
ChromaDB      | lab-chromadb   | Vector database for RAG retrieval        | Serving              | 8000
LangChain App | lab-langchain  | RAG pipeline orchestration               | Serving              | 8001
Nginx         | lab-nginx      | API gateway with rate limiting           | Serving + Management | 8080
MLflow        | lab-mlflow     | Model registry and experiment tracking   | Management           | 5000
Prometheus    | lab-prometheus | Metrics collection and alerting          | Management + Serving | 9090
Grafana       | lab-grafana    | Monitoring dashboards and visualization  | Management           | 3000
OPA           | lab-opa        | Policy engine for governance-as-code     | Security + Serving   | 8181
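The three-zone topology can be sketched in Docker Compose. This is an illustrative fragment, not the lab's actual compose file — the network names, `internal` flags, and image tags are assumptions; only the container names and zone memberships come from the inventory above.

```yaml
# Sketch of the three-zone network layout (assumed names and flags).
networks:
  serving:
    driver: bridge
  management:
    driver: bridge
    internal: true   # assumed: no direct egress; reached via dual-homed services
  security:
    driver: bridge
    internal: true

services:
  ollama:
    image: ollama/ollama        # image tag assumed
    container_name: lab-ollama
    networks: [serving]
  nginx:
    image: nginx                # image tag assumed
    container_name: lab-nginx
    networks: [serving, management]   # dual-homed, matching the inventory
  opa:
    image: openpolicyagent/opa  # image tag assumed
    container_name: lab-opa
    networks: [security, serving]
```

Marking a network `internal` removes its default route, so a compromised serving container cannot reach management services except through an explicitly dual-homed gateway.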

What's Running

The full AI infrastructure stack, from inference to monitoring to policy enforcement.

RAG Pipeline

Ollama serves Llama 3.2 3B for local inference. ChromaDB provides vector storage. LangChain orchestrates retrieval-augmented generation. Nginx acts as the API gateway with rate limiting and security logging.
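One quick way to exercise the inference layer is to call Ollama's generate API directly on its published port, bypassing the gateway. The endpoint and request shape are Ollama's standard API; the prompt is just an example.

```shell
# Query Ollama directly on port 11434.
# "stream": false returns one JSON object instead of a chunked stream.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3.2:3b", "prompt": "What is retrieval-augmented generation?", "stream": false}'
```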

Monitoring Stack

Prometheus scrapes inference metrics from the RAG pipeline. Grafana renders dashboards for latency, throughput, and error rates. MLflow provides model registry and experiment tracking.
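Scrape health is easy to check from the command line via Prometheus's HTTP API. The `up` metric is built in: it is 1 for every target whose last scrape succeeded and 0 otherwise, so this one query shows whether the pipeline's exporters are being collected.

```shell
# List scrape-target health through Prometheus's query API (port 9090).
curl -s 'http://localhost:9090/api/v1/query?query=up'
```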

Governance Engine

Open Policy Agent enforces governance-as-code policies written in Rego. Default deny with risk-tiered approval requirements. Demonstrates automated deployment gates for model governance.
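A minimal sketch of what such a Rego policy might look like. The package name, input fields, and tier labels are assumptions for illustration — the lab's actual policies may differ.

```rego
# Sketch of a default-deny deployment gate (field names assumed).
package model.deployment

import rego.v1

# Default deny: a deployment is rejected unless a rule below allows it.
default allow := false

# Low-risk models may deploy automatically.
allow if {
    input.model.risk_tier == "low"
}

# High-risk models additionally require a recorded approval.
allow if {
    input.model.risk_tier == "high"
    input.approval.granted == true
}
```

With default deny, forgetting to classify a model blocks its deployment rather than silently permitting it — the failure mode a governance gate should have.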

Security Controls

All containers run as non-root users. Three-network zone isolation prevents lateral movement between serving and management zones. Trivy scans container images for vulnerabilities.
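A Trivy scan can be wired into CI so that a vulnerable image fails the build. The flags below are standard Trivy options; the image name is a placeholder — substitute whichever lab image you want to scan.

```shell
# Scan an image for known CVEs; exit non-zero on HIGH/CRITICAL findings
# so a CI pipeline fails the build. Image name is an example.
trivy image --severity HIGH,CRITICAL --exit-code 1 ollama/ollama:latest
```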

Getting Started

Run the full lab environment locally with Docker Compose.

Prerequisites

  • Docker Desktop installed
  • NVIDIA GPU with drivers (for GPU inference)
  • NVIDIA Container Toolkit configured
  • At least 24 GB RAM, 50 GB disk space

Launch Commands

# Start the lab
docker compose up -d

# Pull the model
docker exec lab-ollama ollama pull llama3.2:3b

# Verify all services
docker compose ps
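Beyond `docker compose ps`, each service exposes a health or status endpoint you can hit once the stack is up. These are the services' standard endpoints on the ports from the inventory table; the ChromaDB path assumes its v1 API (newer releases moved it under `/api/v2`).

```shell
# Spot-check each service after startup.
curl -s http://localhost:11434/api/tags         # Ollama: lists pulled models
curl -s http://localhost:8000/api/v1/heartbeat  # ChromaDB heartbeat (v1 API assumed)
curl -s http://localhost:9090/-/healthy         # Prometheus
curl -s http://localhost:3000/api/health        # Grafana
curl -s http://localhost:5000/health            # MLflow
curl -s http://localhost:8181/health            # OPA
```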