SMP CI/CD Pipeline¶
Tooling: GitHub Actions + ArgoCD + Helm · Audience: DevOps, Backend
1. Pipeline overview¶
Code push (feature branch)
│
▼
┌──────────────┐
│ GH Actions │ CI: lint → test → build → scan
│ (PR pipeline)│
└──────┬───────┘
│ PR approved + merged
▼
┌──────────────┐
│ GH Actions │ CD: build image → push to registry → update GitOps repo
│ (main branch)│
└──────┬───────┘
│ commit to gitops-deploys repo
▼
┌──────────────┐
│ ArgoCD │ Sync: pull manifest → apply to K8s cluster
│ (auto-sync) │
└──────┬───────┘
│
▼
Dev env updated
│ manual promote
▼
Staging env updated (after manual approve)
│ manual promote
▼
Prod env updated (after release manager approve)
2. Branch & deploy mapping¶
| Branch | Trigger | Deploys to | Auto/Manual |
|---|---|---|---|
feat/*, fix/* |
PR open | (CI only, no deploy) | Auto CI |
main |
Merge | dev cluster | Auto |
Tag v*.*.*-rc.N |
Tag push | staging cluster | Auto |
Tag v*.*.* |
Tag push | prod cluster | Manual approve |
3. CI: PR pipeline (.github/workflows/ci.yml)¶
name: CI
on:
pull_request:
branches: [main]
push:
branches: [main]
env:
GO_VERSION: "1.22"
jobs:
lint:
name: Lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
cache: true
- name: gofmt check
run: |
fmt_files=$(gofmt -l .)
if [ -n "$fmt_files" ]; then
echo "Unformatted files: $fmt_files"
exit 1
fi
- name: go vet
run: go vet ./...
- uses: golangci/golangci-lint-action@v3
with:
version: latest
args: --timeout=5m
test:
name: Unit + integration tests
runs-on: ubuntu-latest
services:
mysql:
image: mysql:8.0
env:
MYSQL_ROOT_PASSWORD: test
MYSQL_DATABASE: smp_test
ports: ["3306:3306"]
options: >-
--health-cmd "mysqladmin ping" --health-interval 10s --health-timeout 5s --health-retries 5
redis:
image: redis:7
ports: ["6379:6379"]
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
cache: true
- name: Run migrations
run: |
go install github.com/golang-migrate/migrate/v4/cmd/migrate@latest
migrate -database "mysql://root:test@tcp(localhost:3306)/smp_test" -path migrations up
- name: Unit tests
run: go test -race -coverprofile=coverage.out ./...
- name: Integration tests
run: go test -tags=integration ./test/integration/...
env:
MYSQL_DSN: root:test@tcp(localhost:3306)/smp_test?parseTime=true
REDIS_ADDR: localhost:6379
- name: Coverage check
run: |
cov=$(go tool cover -func=coverage.out | grep total | awk '{print $3}' | sed 's/%//')
if (( $(echo "$cov < 60" | bc -l) )); then
echo "Coverage $cov% below threshold 60%"
exit 1
fi
- uses: codecov/codecov-action@v4
with: { file: coverage.out }
security:
name: Security scans
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Gitleaks
uses: gitleaks/gitleaks-action@v2
- name: govulncheck
run: |
go install golang.org/x/vuln/cmd/govulncheck@latest
govulncheck ./...
- name: Trivy filesystem scan
uses: aquasecurity/trivy-action@master
with:
scan-type: 'fs'
severity: 'HIGH,CRITICAL'
exit-code: '1'
openapi:
name: OpenAPI validation
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Spectral lint
uses: stoplightio/spectral-action@latest
with:
file_glob: "api/openapi.yaml"
- name: Breaking change check (vs main)
run: |
npx -y @apidevtools/swagger-cli validate api/openapi.yaml
4. CD: Build & deploy (.github/workflows/cd.yml)¶
name: CD
on:
push:
branches: [main]
tags: ["v*.*.*"]
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
build:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
outputs:
image_tag: ${{ steps.meta.outputs.version }}
steps:
- uses: actions/checkout@v4
- uses: docker/setup-buildx-action@v3
- uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- id: meta
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=sha,format=short
type=semver,pattern={{version}}
- uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
provenance: true
sbom: true
deploy-dev:
needs: build
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
repository: trungnguyenchanh/smp-gitops
token: ${{ secrets.GITOPS_PAT }}
path: gitops
- name: Update image tag in dev manifest
run: |
cd gitops/environments/dev
yq e -i ".image.tag = \"${{ needs.build.outputs.image_tag }}\"" ${{ github.event.repository.name }}.yaml
git config user.email "ci@smp.vn"
git config user.name "ci-bot"
git add . && git commit -m "deploy(dev): ${{ github.event.repository.name }} → ${{ needs.build.outputs.image_tag }}"
git push
deploy-staging:
needs: build
if: startsWith(github.ref, 'refs/tags/v') && contains(github.ref, '-rc.')
runs-on: ubuntu-latest
environment:
name: staging
url: https://staging-api.smp.vn
steps:
- uses: actions/checkout@v4
with:
repository: trungnguyenchanh/smp-gitops
token: ${{ secrets.GITOPS_PAT }}
path: gitops
- name: Update staging manifest
run: |
cd gitops/environments/staging
yq e -i ".image.tag = \"${{ needs.build.outputs.image_tag }}\"" ${{ github.event.repository.name }}.yaml
git add . && git commit -m "deploy(staging): ${{ github.event.repository.name }} → ${{ needs.build.outputs.image_tag }}"
git push
deploy-prod:
needs: build
if: startsWith(github.ref, 'refs/tags/v') && !contains(github.ref, '-rc.')
runs-on: ubuntu-latest
environment:
name: production
url: https://api.smp.vn
steps:
- uses: actions/checkout@v4
with:
repository: trungnguyenchanh/smp-gitops
token: ${{ secrets.GITOPS_PAT }}
path: gitops
- name: Update prod manifest
run: |
cd gitops/environments/prod
yq e -i ".image.tag = \"${{ needs.build.outputs.image_tag }}\"" ${{ github.event.repository.name }}.yaml
git add . && git commit -m "deploy(prod): ${{ github.event.repository.name }} → ${{ needs.build.outputs.image_tag }}"
git push
- name: Notify Slack
uses: slackapi/slack-github-action@v1
with:
payload: |
{"text": "🚀 Prod deploy: ${{ github.event.repository.name }} v${{ needs.build.outputs.image_tag }}"}
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_PROD_WEBHOOK }}
SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
5. GitOps repo structure¶
Repo smp-gitops (separate from service repos):
smp-gitops/
├── README.md
├── environments/
│ ├── dev/
│ │ ├── order-svc.yaml
│ │ ├── dispatch-engine.yaml
│ │ └── ...
│ ├── staging/
│ └── prod/
└── apps/ # ArgoCD Application manifests
├── dev-apps.yaml
├── staging-apps.yaml
└── prod-apps.yaml
ArgoCD configured to auto-sync environments/dev, manual sync staging + prod.
6. Dockerfile template (Go service)¶
Multi-stage build:
# Build stage
FROM golang:1.22-alpine AS builder
RUN apk add --no-cache git ca-certificates
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags="-w -s -X main.version=$(git describe --tags --always) -X main.commit=$(git rev-parse HEAD)" \
-o /app/server ./cmd/server
# Runtime stage
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /app/server
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/app/server"]
7. Helm chart structure¶
Per service repo có deployments/helm/:
deployments/helm/order-svc/
├── Chart.yaml
├── values.yaml # default values
├── values-dev.yaml # dev overrides
├── values-staging.yaml
├── values-prod.yaml
└── templates/
├── deployment.yaml
├── service.yaml
├── configmap.yaml
├── hpa.yaml # auto-scale
├── pdb.yaml # pod disruption budget
└── servicemonitor.yaml # prometheus scrape
8. Deployment strategy¶
Rolling update (dev, staging)¶
Blue/green (prod, critical services)¶
Order-svc, dispatch-engine, partner-svc use blue/green via Argo Rollouts:
spec:
strategy:
blueGreen:
activeService: order-svc-active
previewService: order-svc-preview
autoPromotionEnabled: false # require manual promote
scaleDownDelaySeconds: 300
Flow: 1. Deploy new version → preview service 2. SRE smoke test via preview URL 3. Manual promote → swap active/preview 4. Old version sleep 5 min then scale down 5. If issue → 1-click rollback (re-swap)
9. Database migrations in pipeline¶
Migrations run before service deploy:
# Helm pre-install/pre-upgrade hook
apiVersion: batch/v1
kind: Job
metadata:
name: migrate-order-db
annotations:
helm.sh/hook: pre-install,pre-upgrade
helm.sh/hook-weight: "-5"
spec:
template:
spec:
containers:
- name: migrate
image: migrate/migrate:v4.17.0
args:
- "-database=$(MYSQL_DSN)"
- "-path=/migrations"
- "up"
volumeMounts:
- name: migrations
mountPath: /migrations
Rollback safety:
- Tất cả migrations phải reversible (có file .down.sql)
- Breaking schema change: 2-phase deploy (expand → migrate data → contract)
- Step 1: add new column (nullable) → deploy code đọc cả cũ + mới
- Step 2: backfill data
- Step 3: deploy code chỉ dùng mới
- Step 4: drop column cũ
10. Smoke test post-deploy¶
.github/workflows/smoke-test.yml:
on:
workflow_run:
workflows: ["CD"]
types: [completed]
jobs:
smoke:
if: github.event.workflow_run.conclusion == 'success'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Smoke test
run: |
ENV_URL="${{ vars.ENV_API_URL }}"
# Health check
curl -f "$ENV_URL/health" || exit 1
# Read endpoint check
curl -f "$ENV_URL/api/v1/catalog/services" -H "Authorization: Bearer ${{ secrets.SMOKE_TEST_JWT }}" || exit 1
# Critical create endpoint (idempotency)
curl -f -X POST "$ENV_URL/api/v1/test/echo" -d '{"hello":"world"}' || exit 1
11. Monitoring deploys¶
Post-deploy alerts: - Error rate spike > 2% in 5 min → PagerDuty - Latency p95 > 2x baseline in 5 min → PagerDuty - Pod crash loop → Slack
Grafana dashboard "Deploy timeline" overlays deployments with metrics.
12. Release versioning (SemVer)¶
MAJOR.MINOR.PATCH(vd v3.3.0)- Pre-release:
v3.3.0-rc.1,v3.3.0-rc.2 - Tag created khi PR release merged to main
# Release flow
git checkout main && git pull
git tag -a v3.3.0 -m "Release v3.3.0 · Partner platform"
git push origin v3.3.0
# → triggers CD prod deploy with manual approval
CHANGELOG.md updated trong PR release.
13. Rollback playbook¶
# Option 1: ArgoCD rollback
argocd app rollback smp-order-svc-prod <previous-revision>
# Option 2: kubectl rollback (emergency)
kubectl rollout undo deployment/order-svc -n smp-prod
# Option 3: Re-tag previous version
git tag -a v3.3.0-rollback -m "Rollback to v3.2.5"
# manually update gitops repo
Always: - Notify in #prod-alerts Slack - Create incident ticket - Post-mortem within 48h if customer impact
14. Branch protection rules¶
GitHub settings for main:
- Require PR with 1 approver from CODEOWNERS
- Require status checks: lint, test, security, openapi
- Require linear history
- Require signed commits (Phase 2)
- Lock force push
- Auto-delete branches after merge
15. Container image policy¶
- Base: distroless or alpine only
- No
:latesttags allowed - All images scanned by Trivy (block HIGH/CRITICAL)
- SBOM generated and signed (cosign)
- Image registry: GitHub Container Registry (ghcr.io)
- Retention: prod tags forever, dev tags 30 days