Zac Peterson
DevOps & Site Reliability Engineer | AI/ML Infrastructure Architect
Helping startups scale from zero to millions of users
Staff DevOps consultant specializing in AI platform engineering, HIPAA-compliant healthcare systems, and production ML pipelines. Building startup infrastructure, GPU clusters, and enterprise cloud architecture across AWS, GCP, and Azure.
Core Competencies
Deep expertise across the modern infrastructure stack, specializing in AI/ML platforms, healthcare cloud systems, and startup scalability
AI Infrastructure & MLOps
Designing production ML pipelines with GPU cluster management, LLM deployment, and model serving at scale for AI companies.
Kubernetes & Container Orchestration
Certified Kubernetes Administrator (CKA) with expertise in cloud-native application deployment and service mesh architecture.
Infrastructure as Code
HashiCorp Terraform Associate certified. Building reproducible, version-controlled infrastructure for rapid startup scaling.
Cloud Architecture
AWS Solutions Architect Professional and Google Cloud Professional Cloud Architect certified. Multi-cloud architecture from seed-stage to Series C startups.
DevOps & Platform Engineering
Building developer experience platforms with GitOps practices for zero-to-production deployment velocity.
Site Reliability Engineering
Ensuring 99.99% uptime through SLOs, error budgets, chaos engineering, and incident management best practices.
Observability & Monitoring
Full-stack observability with metrics, logs, and distributed tracing for complex microservices architectures.
Healthcare & HIPAA Compliance
Building secure, HIPAA-compliant infrastructure for digital health platforms with PHI protection, BAA compliance, and FHIR/HL7 integration.
Professional Certifications
Validated expertise in the most sought-after cloud and DevOps technologies
Certified Kubernetes Administrator
Cloud Native Computing Foundation (CNCF)
Expert-level certification validating skills in Kubernetes cluster administration, troubleshooting, and security.
Certified Kubernetes Security Specialist
Cloud Native Computing Foundation (CNCF)
Advanced certification demonstrating expertise in securing Kubernetes clusters and container-based applications.
AWS Solutions Architect Professional
Amazon Web Services
Professional-level certification for designing distributed systems and complex AWS architectures.
Google Cloud Professional Cloud Architect
Google Cloud
Professional certification for designing, developing, and managing Google Cloud solutions.
HashiCorp Terraform Associate
HashiCorp
Certification validating Infrastructure as Code skills using Terraform across multi-cloud environments.
Tools & Technologies
Production-tested technologies for building scalable infrastructure across AI platforms, healthcare systems, and high-growth startups
Kubernetes
Container Orchestration
AWS
Cloud Provider
GCP
Cloud Provider
Azure
Cloud Provider
Terraform
Infrastructure as Code
Pulumi
Infrastructure as Code
PostgreSQL
Database
Redis
In-Memory Cache
Prometheus
Observability
Grafana
Visualization
Datadog
Observability
ArgoCD
Continuous Delivery
Vault
Secrets Management
Istio
Service Mesh
NVIDIA GPU
AI/ML Acceleration
Docker
Containerization
SageMaker
MLOps Platform
Vertex AI
MLOps Platform
vLLM
LLM Serving
HIPAA
Healthcare Security
Professional Experience
8+ years building and scaling enterprise cloud infrastructure across healthcare startups, AI companies, Fortune 500 enterprises, and high-growth DTC brands
Staff DevOps Engineer - AI/ML Infrastructure
Homeward Health
Healthcare Startup | Digital Health Platform
- Reduced deployment cycles by 70% by engineering automated CI/CD pipelines with AWS CodePipeline and CodeBuild for containerized Python, Go, and SQL microservices
- Designed HIPAA-compliant Kubernetes infrastructure on AWS EKS, with secure ML pipelines on SageMaker and encryption at rest and in transit to protect PHI
- Architected FHIR-compliant data ingestion and transformation workflows using AWS Kinesis, Lambda, and Redshift for processing HL7 messages at scale
- Achieved a 45% reduction in cloud costs through intelligent resource scaling, spot instance utilization, and comprehensive CloudWatch monitoring
- Established model versioning, experiment tracking, and automated retraining workflows using SageMaker and MLflow for production ML pipelines
Staff DevOps Architect - AI/ML Infrastructure
BP
Fortune 500 | Enterprise AI Platform
- Reduced infrastructure costs by 40% by architecting GKE and AKS clusters with GPU acceleration for 500+ concurrent AI/ML workloads
- Cut model deployment time from weeks to hours by building automated MLOps pipelines with Vertex AI and custom model serving infrastructure
- Achieved 99.9% uptime by establishing observability with Prometheus, Grafana, and the ELK Stack alongside GitOps workflows
- Enabled processing of 10+ TB of data daily by designing transformation pipelines with Apache Spark and Databricks, while managing 1,000+ ML models in production
- Reduced MTTR from 4 hours to 15 minutes and ensured SOC 2 compliance by leading Site Reliability Engineering practices
Senior Cloud Architect
Vuori
DTC E-commerce | High-Traffic Retail Platform
- Reduced cloud costs by 60% by restructuring and automating resources across GCP and Azure using Terraform Infrastructure as Code
- Architected and deployed Kubernetes clusters on GCP GKE and Azure AKS, utilizing Vertex AI for scalable ML workflows and demand forecasting
- Designed hybrid networking between on-premises and cloud environments and built ETL pipelines for data migration, supporting Black Friday traffic scaling
- Containerized legacy applications and migrated workloads to GCP and Azure with automated deployment pipelines for rapid iteration
Senior DevOps Engineer
Wickfire LLC
Legal Tech Startup | SaaS Platform
- Reduced infrastructure downtime by 20% by designing and scaling backend IT infrastructure with comprehensive SLAs and SLOs
- Accelerated software delivery by 30% by developing automated GitLab CI/CD pipelines on GCP for rapid startup iteration
- Authored reusable Terraform modules for automated provisioning of GCP, Azure, and Vertex AI resources enabling zero-to-production deployment
- Implemented advanced monitoring and alerting with GCP Operations Suite and Azure Monitor for full-stack observability
Managed Services Practice Lead
Redapt Inc.
Cloud Consulting | Enterprise Solutions
- Reduced cloud expenditure by 60% by restructuring and optimizing GCP and Azure resources through Terraform and Infrastructure as Code
- Architected GKE and AKS clusters for startups and enterprises, containerizing applications and automating deployment pipelines
- Built CI/CD pipelines with GitLab and GCP Cloud Build, orchestrating deployments with Python, Go, and Terraform
- Maintained ISO-, HIPAA-, and FedRAMP-compliant infrastructure for healthcare and government clients with sensitive data requirements
Solutions Architect
Rapid Domains Inc.
Web Hosting | Multi-Cloud Infrastructure
- Improved cloud infrastructure reliability for 200+ clients by designing scalable solutions in hybrid Azure and GCP environments
- Reduced manual operations workload by 40% by developing automation scripts in Python and Go for startup clients
- Enhanced database performance for mission-critical workloads by optimizing SQL queries and implementing automated monitoring
Let's Build Something Great
Looking to improve system reliability, performance, or scalability? I offer free technical consultations to help optimize your infrastructure and deployment processes.
Contact Information
Email
zac@zacp.xyz
Phone
(858) 922-5573
Location
San Francisco, CA
Availability
Free consultations available