AI Infrastructure Lab

A fully functional AI demonstration environment running 8 services across three isolated Docker networks. Every video in the series uses this lab for live demonstrations of security controls, governance policies, and monitoring configurations.

Three-Zone Architecture

Network isolation mirrors the Reference Architecture's three-zone topology — serving, management, and security networks are isolated by design.

Serving Network

Inference, RAG pipeline, API gateway. Handles all external-facing traffic and model serving.

Ollama:11434
ChromaDB:8000
LangChain App:8001
Nginx:8080
Prometheus:9090
OPA:8181

Management Network

Monitoring, model registry, dashboards. Internal operations and observability.

Nginx:8080
MLflow:5000
Prometheus:9090
Grafana:3000

Security Network

Policy engine and scanning. Enforces governance controls across all zones.

OPA:8181
Service Inventory

Service       | Container      | Purpose                                  | Network              | Port
------------- | -------------- | ---------------------------------------- | -------------------- | -----
Ollama        | lab-ollama     | Local LLM inference (Llama 3.2 3B)       | Serving              | 11434
ChromaDB      | lab-chromadb   | Vector database for RAG retrieval        | Serving              | 8000
LangChain App | lab-langchain  | RAG pipeline orchestration               | Serving              | 8001
Nginx         | lab-nginx      | API gateway with rate limiting           | Serving + Management | 8080
MLflow        | lab-mlflow     | Model registry and experiment tracking   | Management           | 5000
Prometheus    | lab-prometheus | Metrics collection and alerting          | Management + Serving | 9090
Grafana       | lab-grafana    | Monitoring dashboards and visualization  | Management           | 3000
OPA           | lab-opa        | Policy engine for governance-as-code     | Security + Serving   | 8181
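The three-zone topology can be sketched in Docker Compose. This is an illustrative fragment, not the lab's actual compose file — the network names, `internal` flags, and image tags are assumptions; only the container names and zone memberships come from the inventory above.

```yaml
# Sketch of the three-zone network layout (assumed names and flags).
networks:
  serving:
    driver: bridge
  management:
    driver: bridge
    internal: true   # assumed: no direct egress; reached via dual-homed services
  security:
    driver: bridge
    internal: true

services:
  ollama:
    image: ollama/ollama        # image tag assumed
    container_name: lab-ollama
    networks: [serving]
  nginx:
    image: nginx                # image tag assumed
    container_name: lab-nginx
    networks: [serving, management]   # dual-homed, matching the inventory
  opa:
    image: openpolicyagent/opa  # image tag assumed
    container_name: lab-opa
    networks: [security, serving]
```

Marking a network `internal` removes its default route, so a compromised serving container cannot reach management services except through an explicitly dual-homed gateway.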

What's Running

The full AI infrastructure stack, from inference to monitoring to policy enforcement.

RAG Pipeline

Ollama serves Llama 3.2 3B for local inference. ChromaDB provides vector storage. LangChain orchestrates retrieval-augmented generation. Nginx acts as the API gateway with rate limiting and security logging.
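One quick way to exercise the inference layer is to call Ollama's generate API directly on its published port, bypassing the gateway. The endpoint and request shape are Ollama's standard API; the prompt is just an example.

```shell
# Query Ollama directly on port 11434.
# "stream": false returns one JSON object instead of a chunked stream.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3.2:3b", "prompt": "What is retrieval-augmented generation?", "stream": false}'
```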

Monitoring Stack

Prometheus scrapes inference metrics from the RAG pipeline. Grafana renders dashboards for latency, throughput, and error rates. MLflow provides model registry and experiment tracking.
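Scrape health is easy to check from the command line via Prometheus's HTTP API. The `up` metric is built in: it is 1 for every target whose last scrape succeeded and 0 otherwise, so this one query shows whether the pipeline's exporters are being collected.

```shell
# List scrape-target health through Prometheus's query API (port 9090).
curl -s 'http://localhost:9090/api/v1/query?query=up'
```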

Governance Engine

Open Policy Agent enforces governance-as-code policies written in Rego. Default deny with risk-tiered approval requirements. Demonstrates automated deployment gates for model governance.
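A minimal sketch of what such a Rego policy might look like. The package name, input fields, and tier labels are assumptions for illustration — the lab's actual policies may differ.

```rego
# Sketch of a default-deny deployment gate (field names assumed).
package model.deployment

import rego.v1

# Default deny: a deployment is rejected unless a rule below allows it.
default allow := false

# Low-risk models may deploy automatically.
allow if {
    input.model.risk_tier == "low"
}

# High-risk models additionally require a recorded approval.
allow if {
    input.model.risk_tier == "high"
    input.approval.granted == true
}
```

With default deny, forgetting to classify a model blocks its deployment rather than silently permitting it — the failure mode a governance gate should have.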

Security Controls

All containers run as non-root users. Three-network zone isolation prevents lateral movement between serving and management zones. Trivy scans container images for vulnerabilities.
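A Trivy scan can be wired into CI so that a vulnerable image fails the build. The flags below are standard Trivy options; the image name is a placeholder — substitute whichever lab image you want to scan.

```shell
# Scan an image for known CVEs; exit non-zero on HIGH/CRITICAL findings
# so a CI pipeline fails the build. Image name is an example.
trivy image --severity HIGH,CRITICAL --exit-code 1 ollama/ollama:latest
```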

Getting Started

Run the full lab environment locally with Docker Compose.

Prerequisites

  • Docker Desktop installed
  • NVIDIA GPU with drivers (for GPU inference)
  • NVIDIA Container Toolkit configured
  • At least 24 GB RAM, 50 GB disk space

Launch Commands

# Start the lab
docker compose up -d

# Pull the model
docker exec lab-ollama ollama pull llama3.2:3b

# Verify all services
docker compose ps
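Beyond `docker compose ps`, each service exposes a health or status endpoint you can hit once the stack is up. These are the services' standard endpoints on the ports from the inventory table; the ChromaDB path assumes its v1 API (newer releases moved it under `/api/v2`).

```shell
# Spot-check each service after startup.
curl -s http://localhost:11434/api/tags         # Ollama: lists pulled models
curl -s http://localhost:8000/api/v1/heartbeat  # ChromaDB heartbeat (v1 API assumed)
curl -s http://localhost:9090/-/healthy         # Prometheus
curl -s http://localhost:3000/api/health        # Grafana
curl -s http://localhost:5000/health            # MLflow
curl -s http://localhost:8181/health            # OPA
```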