# Docker & AWS Bedrock

*Containerisation · ECS Fargate · AWS Bedrock LLMs · CI/CD pipeline · Production deployment*
## Docker Fundamentals
| Concept | What it is |
|---|---|
| Image | Immutable blueprint — layers of filesystem changes, built from Dockerfile |
| Container | Running instance of an image — isolated process with its own filesystem, network, PID space |
| Registry | Store for images — Docker Hub, AWS ECR, GitHub Container Registry |
| Layer cache | Each Dockerfile instruction = one layer. Unchanged layers are cached — build only what changed |
| Multi-stage build | Use a build image for compilation, copy only artifacts to lean runtime image |
## Production Dockerfile for FastAPI + LangGraph

```dockerfile
# Stage 1: build dependencies
FROM python:3.11-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: lean runtime image
FROM python:3.11-slim
WORKDIR /app

# Non-root user for security
RUN groupadd -r appuser && useradd -r -g appuser appuser

# Copy installed packages from builder
COPY --from=builder /root/.local /home/appuser/.local
COPY --chown=appuser:appuser . .

USER appuser
ENV PATH=/home/appuser/.local/bin:$PATH
ENV PYTHONUNBUFFERED=1
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]
```
### Order Dockerfile layers correctly

Put `COPY requirements.txt` and `pip install` BEFORE copying source code. Dependencies change rarely; source code changes on every commit. This maximises layer cache hits and keeps CI builds fast.
## Docker Compose for Local Dev

```yaml
version: "3.9"
services:
  api:
    build: .
    ports: ["8000:8000"]
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/loaniq
      - REDIS_URL=redis://redis:6379
      - AWS_REGION=ap-south-1
    depends_on:
      db: { condition: service_healthy }
      redis: { condition: service_started }
    volumes:
      - ./:/app   # hot-reload in dev

  db:
    image: pgvector/pgvector:pg16
    environment: { POSTGRES_DB: loaniq, POSTGRES_USER: user, POSTGRES_PASSWORD: pass }
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d loaniq"]
      interval: 5s
      retries: 5

  redis:
    image: redis:7-alpine
```
## AWS ECS Fargate Deployment
Fargate runs containers without managing EC2 instances. Key concepts:
| Component | Purpose |
|---|---|
| ECS Cluster | Logical grouping of services |
| Task Definition | Blueprint: image URI, CPU/memory, env vars, IAM role |
| Service | Keeps N tasks running, handles rolling deploys |
| ECR | Private Docker registry — stores your images in AWS |
| Task IAM Role | Grants the running container permissions (S3, Bedrock, SQS...) |
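The task definition in the table above can be sketched as a plain Python dict, the same shape that boto3's `ecs.register_task_definition(**td)` accepts. The account ID, role names, and image URI below are placeholder assumptions, not values from a real deployment.

```python
# Sketch of a Fargate task definition for the API container.
# Role ARNs and account ID are hypothetical placeholders.
def loaniq_task_definition(image_uri: str, account_id: str = "123456789012") -> dict:
    return {
        "family": "loaniq",
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",   # Fargate requires awsvpc networking
        "cpu": "1024",             # 1 vCPU
        "memory": "2048",          # 2 GB
        # Execution role: lets ECS pull the image and write logs
        "executionRoleArn": f"arn:aws:iam::{account_id}:role/loaniq-execution-role",
        # Task role: what the running container itself may call (Bedrock, S3, ...)
        "taskRoleArn": f"arn:aws:iam::{account_id}:role/loaniq-task-role",
        "containerDefinitions": [
            {
                "name": "api",
                "image": image_uri,
                "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
                "environment": [{"name": "AWS_REGION", "value": "ap-south-1"}],
            }
        ],
    }

td = loaniq_task_definition("123456789012.dkr.ecr.ap-south-1.amazonaws.com/loaniq:abc123")
# Register with: boto3.client("ecs").register_task_definition(**td)
```

Note the two distinct IAM roles: the execution role is used by the ECS agent, the task role by your application code.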
## CI/CD Pipeline — GitHub Actions → ECR → ECS

```yaml
# .github/workflows/deploy.yml
name: Deploy to ECS
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ap-south-1

      - name: Login to ECR
        run: aws ecr get-login-password | docker login --username AWS --password-stdin ${{ secrets.ECR_REGISTRY }}

      - name: Build and push image
        run: |
          docker build -t ${{ secrets.ECR_REGISTRY }}/loaniq:$GITHUB_SHA .
          docker push ${{ secrets.ECR_REGISTRY }}/loaniq:$GITHUB_SHA

      - name: Update ECS service
        run: |
          # --force-new-deployment only pulls a fresh image if the task
          # definition references a mutable tag (e.g. :latest). With
          # per-commit SHA tags, register a new task definition revision
          # instead (e.g. via aws-actions/amazon-ecs-deploy-task-definition).
          aws ecs update-service --cluster loaniq-cluster --service loaniq-service --force-new-deployment
```
## AWS Bedrock

Bedrock is a managed API for foundation models — Claude (Anthropic), Titan (Amazon), Llama (Meta), Mistral — without running any model infrastructure. Traffic stays on the AWS network (and can be kept off the public internet with a VPC endpoint), and your prompts are not used to train the underlying models.
### Invoking Claude via Bedrock

```python
import boto3, json

bedrock = boto3.client("bedrock-runtime", region_name="ap-south-1")

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": "Summarise this loan application: ..."}
        ]
    }),
    contentType="application/json",
    accept="application/json",
)

result = json.loads(response["body"].read())
text = result["content"][0]["text"]
```
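The request payload above has a fixed shape, so it can help to build it in one tested helper. `claude_body` is a hypothetical convenience function, not part of boto3 or the Bedrock API.

```python
import json

def claude_body(prompt: str, max_tokens: int = 1024, temperature: float = 0.0) -> str:
    """Build the JSON body for an Anthropic-on-Bedrock invoke_model call."""
    payload = {
        "anthropic_version": "bedrock-2023-05-31",  # required for Claude on Bedrock
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload)

body = claude_body("Summarise this loan application: ...")
# bedrock.invoke_model(modelId="anthropic.claude-3-sonnet-20240229-v1:0", body=body, ...)
```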
### Streaming response from Bedrock

```python
response = bedrock.invoke_model_with_response_stream(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps({...})
)

stream = response["body"]
for event in stream:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk["type"] == "content_block_delta":
        print(chunk["delta"]["text"], end="", flush=True)
```
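If you need the full completion rather than console output, the same event-parsing loop can be factored into a pure function. The fake events below mimic the Bedrock chunk wire shape for illustration only.

```python
import json

def collect_text(events) -> str:
    """Assemble the full completion from streamed Bedrock chunk events."""
    parts = []
    for event in events:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "content_block_delta":
            parts.append(chunk["delta"]["text"])
    return "".join(parts)

# Fake events in the Bedrock chunk shape, for illustration:
fake = [
    {"chunk": {"bytes": json.dumps({"type": "content_block_delta", "delta": {"text": "Hel"}}).encode()}},
    {"chunk": {"bytes": json.dumps({"type": "content_block_delta", "delta": {"text": "lo"}}).encode()}},
    {"chunk": {"bytes": json.dumps({"type": "message_stop"}).encode()}},
]
print(collect_text(fake))  # → Hello
```

In production you would pass `response["body"]` instead of `fake`.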
### LangChain + Bedrock Integration

```python
from langchain_aws import ChatBedrock

llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name="ap-south-1",
    model_kwargs={"max_tokens": 2048, "temperature": 0},
)

# Drop-in replacement for ChatOpenAI
# (prompt and parser are defined elsewhere in the app)
chain = prompt | llm | parser
```
### IAM Permissions for Bedrock

Attach this statement to the ECS task role so the running container can call Bedrock:

```json
{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": [
    "arn:aws:bedrock:ap-south-1::foundation-model/anthropic.claude-3-sonnet*"
  ]
}
```
## Common Interview Questions
**Q: Why Fargate over EC2 for LLM workloads?**
Fargate has no instance management overhead — no patching, no capacity reservations. For API services that scale in and out, Fargate is simpler. The trade-off: can't run GPU workloads on Fargate (use EC2 with GPU instances for self-hosted models). Since LoanIQ uses Bedrock for inference, Fargate is the right choice.
**Q: How do you manage secrets in containers?**
Never hardcode secrets in images. Use AWS Secrets Manager — inject secret ARNs as environment variables in the ECS task definition. ECS fetches the secret at startup. For local dev, use a .env file (gitignored) with docker-compose env_file.
**Q: How do you size Fargate tasks for a RAG service?**
Profile CPU and memory under load. RAG services are mostly I/O-bound (Bedrock API calls, DB queries) — 1 vCPU and 2GB RAM handles moderate traffic. Add CPU/RAM if you run local reranking models (cross-encoder) in the container. Set ECS auto-scaling on CPU utilisation at 70%.