Job Description
DevOps Engineer
U.S. GenAI startup, Cambridge Office
Full-time employment. We are committed to building a transformative AI platform that revolutionizes software development. Our goal is to enable you to have a long, impactful career with us, with opportunities for advancement. If you want a role where you can shape the future of AI-powered infrastructure, read on!
About Us
We are a Boston, MA-based Generative AI start-up on a mission to automate custom software creation to unlock the next industrial revolution. We're building an AI-powered platform capable of autonomously generating enterprise-grade software, powered by thousands of cooperative AI agents working in concert.
We’re backed by multiple tier 1 investors, our founders succeeded at their previous start-up, and we hold dozens of Generative AI patents.
Location: 1 Kendall Square, Cambridge, MA (In-person role)
About the Role
We’re looking for an exceptional DevOps Engineer to architect and maintain the infrastructure that powers our revolutionary AI agent ecosystem. You’ll be instrumental in building scalable, resilient systems that support both our cutting-edge AI platform and modern applications. This role offers the unique opportunity to work at the intersection of traditional DevOps and emerging AI infrastructure, creating systems that enable thousands of AI agents to collaborate seamlessly.
As our DevOps Engineer, you’ll take ownership of our entire infrastructure stack—from Kubernetes orchestration to AI agent deployment pipelines. You’ll work directly with our engineering teams to ensure our platform can scale to support enterprise customers while maintaining the performance and reliability they demand.
What Success Looks Like
- Architect and implement robust Kubernetes infrastructure that scales effortlessly to support our growing AI agent ecosystem
- Create sophisticated CI/CD pipelines that enable rapid, reliable deployment of both traditional services and AI agents
- Develop Python-based automation to eliminate manual tasks and accelerate development velocity
- Design monitoring and observability systems for deep insights into both infrastructure and AI agent performance
- Optimize cloud infrastructure for cost-efficiency while maintaining enterprise-grade reliability
- Collaborate effectively with development teams to improve developer experience and productivity
- Proactively identify and resolve infrastructure bottlenecks before they impact customers
- Establish infrastructure best practices to support rapid growth
- Build systems that handle the unique challenges of AI workloads at scale
- Maintain 99.9%+ uptime for critical production services
Areas of Ownership
Core Infrastructure:
- Kubernetes cluster design, deployment, and management for AI and application workloads
- Infrastructure as Code using Terraform for multi-cloud environments
- Container orchestration and optimization for AI agent deployment
- Network architecture and security for distributed systems
Automation & Tooling:
- Python-based automation scripts for infrastructure management
- Helm chart development and maintenance for application deployment
- CI/CD pipeline design using modern DevOps tools
- Developer productivity tooling and automation
Monitoring & Reliability:
- Comprehensive monitoring, alerting, and tracing systems
- Performance optimization for AI workloads
- Incident response and disaster recovery planning
- Cost optimization and resource management
AI Infrastructure (Unique to Us):
- Infrastructure for AI agent orchestration and management
- MLOps pipeline integration
- Scalable systems for handling AI model deployment
- Resource optimization for GPU/compute-intensive workloads
Required Technical Experience
- 5–8 years of DevOps/Infrastructure experience
- Expert-level Python proficiency for automation and scripting
- Deep Kubernetes expertise: deployment, scaling, troubleshooting, and optimization
- Strong experience with Helm for application package management
- Proven track record designing and implementing CI/CD pipelines
- Hands-on experience with major cloud platforms (AWS, Azure, or GCP)
- Terraform expertise for Infrastructure as Code
- Strong Linux administration and containerization (Docker) skills
- Experience with monitoring tools (Prometheus, Grafana, ELK stack)
- Understanding of microservices architecture and distributed systems
Ways to Stand Out
- CKA (Certified Kubernetes Administrator) or CKAD certification
- Experience with MLOps tools (MLflow, Kubeflow, Ray, etc.)
- Knowledge of AI/ML infrastructure requirements and optimization
- Experience with GPU orchestration and management
- API gateway and service mesh implementation (Istio, Linkerd)
- GitOps experience (ArgoCD, Flux)
- Experience scaling infrastructure for high-growth startups
- Contributions to open-source infrastructure projects
- Experience with multi-region, highly available deployments
- Background in security and compliance (SOC2, HIPAA)
You’ll Get…
- Competitive salary
- Comprehensive health, dental, and vision insurance
- 401(k) with company match
- Flexible PTO policy
- $5,000 annual professional development budget
- Latest hardware and software tools
- The opportunity to shape infrastructure for the future of software development
- Work with cutting-edge AI technology and world-class engineers
- Modern office in Cambridge’s innovation hub
- Regular team events and activities
- The chance to solve novel infrastructure challenges at the intersection of DevOps and AI
Culture
Who we are: Our founding team consists of a serial Gen AI inventor and a successful serial entrepreneur. We work hard, maintain a curious mindset, and believe in a low-ego, high-output approach.
We move fast. Time is our most precious asset. We make decisions quickly and iterate rapidly, believing that a good decision today beats a perfect decision next week.
We have a Championship Mindset. We operate like a professional team—winning together by maintaining high standards, supporting each other, and staying laser-focused on our mission.
We have a Passion for Invention. As technologists pushing the boundaries of what’s possible with AI, we thrive on solving problems that haven’t been solved before.
What We Ask of You
This role requires someone who thrives in ambiguity and loves tackling unprecedented challenges. You’ll be building infrastructure for a type of platform that’s never existed before—one where thousands of AI agents collaborate to write software. This means being comfortable with rapid change, continuous learning, and creative problem-solving.
You should be excited about working independently while collaborating in-person with our team at our Cambridge headquarters. The ability to communicate complex technical concepts clearly and work effectively with both technical and non-technical stakeholders is essential.
To Apply
Apply with your resume and a brief note about:
- Your most challenging infrastructure project and how you solved it
- Why you’re excited about building infrastructure for AI-powered software development
Interview Process
Here’s what you can expect:
- Initial screening call (30 minutes)
- Technical discussion with our team (45 minutes)
- Deep dive system design (60 minutes)
- Final conversation with leadership (45 minutes)
- Offer discussion
We are an equal opportunity employer committed to building a diverse and inclusive team.
