AI Infrastructure Lab
A fully functional AI demonstration environment running 8 services across three isolated Docker networks. Every video in the series uses this lab for live demonstrations of security controls, governance policies, and monitoring configurations.
Three-Zone Architecture
Network isolation mirrors the Reference Architecture's three-zone topology — serving, management, and security networks are isolated by design.
Serving Network
Inference, RAG pipeline, API gateway. Handles all external-facing traffic and model serving.
Management Network
Monitoring, model registry, dashboards. Internal operations and observability.
Security Network
Policy engine and scanning. Enforces governance controls across all zones.
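The three zones above can be sketched as user-defined Docker Compose networks. This is an illustrative assumption of how the lab's compose file might be laid out, not its actual contents; the `internal: true` flags show one way to keep the management and security zones off the external network:

```yaml
# Sketch of the three-zone topology as Docker Compose networks.
# Network names and settings are illustrative, not the lab's real file.
networks:
  serving:
    driver: bridge            # external-facing traffic enters here via the gateway
  management:
    driver: bridge
    internal: true            # no direct route to the outside world
  security:
    driver: bridge
    internal: true

services:
  ollama:
    image: ollama/ollama
    container_name: lab-ollama
    networks: [serving]
  nginx:
    image: nginx
    container_name: lab-nginx
    networks: [serving, management]   # bridges two zones, matching the service table
    ports:
      - "8080:80"
```

Services attached to a single network can only reach peers in that zone, which is what blocks lateral movement between serving and management.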
| Service | Container | Purpose | Network | Port |
|---|---|---|---|---|
| Ollama | lab-ollama | Local LLM inference (Llama 3.2 3B) | Serving | 11434 |
| ChromaDB | lab-chromadb | Vector database for RAG retrieval | Serving | 8000 |
| LangChain App | lab-langchain | RAG pipeline orchestration | Serving | 8001 |
| Nginx | lab-nginx | API gateway with rate limiting | Serving + Management | 8080 |
| MLflow | lab-mlflow | Model registry and experiment tracking | Management | 5000 |
| Prometheus | lab-prometheus | Metrics collection and alerting | Management + Serving | 9090 |
| Grafana | lab-grafana | Monitoring dashboards and visualization | Management | 3000 |
| OPA | lab-opa | Policy engine for governance-as-code | Security + Serving | 8181 |
What's Running
The full AI infrastructure stack, from inference to monitoring to policy enforcement.
RAG Pipeline
Ollama serves Llama 3.2 3B for local inference. ChromaDB provides vector storage. LangChain orchestrates retrieval-augmented generation. Nginx acts as the API gateway with rate limiting and security logging.
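Assuming the default ports from the service table, the serving zone can be smoke-tested from the host. The Ollama and ChromaDB endpoints below are their standard APIs; the gateway routing path is an assumption about how the lab's Nginx config is set up:

```shell
# Verify each serving-zone component is up (ports from the service table).
curl -s http://localhost:11434/api/tags            # Ollama: list pulled models
curl -s http://localhost:8000/api/v1/heartbeat     # ChromaDB: heartbeat check
# Send a prompt through the Nginx gateway (routing path is an assumption):
curl -s http://localhost:8080/api/generate \
  -d '{"model": "llama3.2:3b", "prompt": "What is zone isolation?"}'
```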
Monitoring Stack
Prometheus scrapes inference metrics from the RAG pipeline. Grafana renders dashboards for latency, throughput, and error rates. MLflow provides model registry and experiment tracking.
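A minimal Prometheus scrape job for this setup might look like the sketch below. The job names, metrics path, and targets are assumptions (the LangChain app would need to expose a Prometheus-format `/metrics` endpoint, and Nginx would need a metrics exporter):

```yaml
# prometheus.yml sketch: scrape inference metrics from the RAG pipeline.
# Targets assume Docker's built-in DNS on the shared network.
scrape_configs:
  - job_name: rag-pipeline
    metrics_path: /metrics
    static_configs:
      - targets: ['lab-langchain:8001']   # LangChain app on the serving network
  - job_name: gateway
    static_configs:
      - targets: ['lab-nginx:8080']       # assumes an Nginx metrics exporter
```

This works because Prometheus sits on both the management and serving networks, so it can resolve and reach the serving-zone containers by name.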
Governance Engine
Open Policy Agent enforces governance-as-code policies written in Rego. Default deny with risk-tiered approval requirements. Demonstrates automated deployment gates for model governance.
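The default-deny, risk-tiered pattern can be sketched in Rego. The input fields (`risk_tier`, `approvals`, `scan_passed`) and the approval thresholds are illustrative assumptions, not the lab's actual policy:

```rego
package model.deploy

import rego.v1

# Default deny: deployment is allowed only when a rule below proves it.
default allow := false

# Low-risk models deploy with a single approval.
allow if {
    input.risk_tier == "low"
    count(input.approvals) >= 1
}

# High-risk models require two approvals and a passing security scan.
allow if {
    input.risk_tier == "high"
    count(input.approvals) >= 2
    input.scan_passed
}
```

A deployment gate then queries OPA (port 8181 in the table) with the model's metadata and blocks the rollout unless `allow` evaluates to `true`.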
Security Controls
All containers run as non-root users. Three-network zone isolation prevents lateral movement between serving and management zones. Trivy scans container images for vulnerabilities.
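The image-scanning step can be reproduced locally with Trivy; the image reference below is a placeholder for whichever lab image you want to check:

```shell
# Scan an image for HIGH/CRITICAL vulnerabilities; non-zero exit on findings
# makes this usable as a CI gate.
trivy image --severity HIGH,CRITICAL --exit-code 1 ollama/ollama:latest
```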
Getting Started
Run the full lab environment locally with Docker Compose.
Prerequisites
- Docker Desktop installed
- NVIDIA GPU with drivers (for GPU inference)
- NVIDIA Container Toolkit configured
- At least 24 GB RAM, 50 GB disk space
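With the prerequisites in place, a typical bring-up might look like this; the compose file location and the model pull step are assumptions based on the stack described above:

```shell
# Start all eight services in the background
# (assumes a docker-compose.yml in the repo root).
docker compose up -d

# Pull the model into the Ollama container
# (model name taken from the stack description).
docker exec lab-ollama ollama pull llama3.2:3b

# Confirm everything is healthy.
docker compose ps
```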