Shivam Gupta
Cloud Computing & AI/ML Infrastructure Expert with 7+ years of experience building scalable, enterprise-grade solutions at Microsoft Azure
Professional Summary
Driving next-generation cloud infrastructure and AI/ML platforms at Microsoft Azure
7+ Years of Experience
Deep expertise in cloud computing, distributed systems, container orchestration, and AI/ML infrastructure at Microsoft Azure.
Azure Container Orchestration Leader
Led the end-to-end architecture and delivery of the Azure Container Instance Orchestration Platform, with advanced orchestration features (auto-scaling, load balancing, rolling upgrades, and self-healing) for enterprise-grade container workloads.
AI/ML Infrastructure Architect
Driving AI/ML infrastructure for Azure AI and Copilot teams, enabling GPU-backed container orchestration, high-throughput pipelines, and optimized compute environments for large-scale model training and inference.
Open-Source Integration
Integrated the serverless containers platform into key open-source projects, accelerating adoption and driving onboarding of high-value enterprise customers.
Massive Scale Infrastructure
Designed and scaled serverless container infrastructure supporting 500M+ container workloads globally, building key features such as confidential containers, networking, overcapacity allocation, and E2E testing frameworks.
Scalable & Resilient Systems Expert
Expert in building scalable, resilient services with a strong focus on reliability, operational excellence, and distributed system design for next-generation AI workloads.
Professional Experience
Software Engineer 2
- Led end-to-end development of the Azure serverless container orchestration platform with auto-scaling, rolling upgrades, load balancing, and self-healing features
- Drove AI/ML infrastructure for Azure AI and Copilot teams, designing GPU allocation strategies and high-throughput compute pipelines
- Scaled Azure Containers platform to support 500M+ container workloads globally
- Contributing to the Radius open-source project in collaboration with the Azure CTO's team
- Built monitoring platforms, E2E testing frameworks, and alerting systems that improved service reliability
Software Engineer
- Designed and developed full-stack order management platform for call center teams
- Built automated resolution engine providing financially optimal solutions for customer issues
- Improved order resolution time by 20% through system enhancements and workflow optimization
- Collaborated with cross-functional teams to boost efficiency and enhance customer satisfaction
Software Engineer
- Developed real-time metrics and analytics platform for business teams
- Designed RESTful microservices using Spring Boot, Java, and Apache Spark
- Configured and optimized Spark clusters, improving data processing efficiency by 30%
- Contributed to data-driven insights and reporting pipelines supporting critical business decisions