Infrastructure & DevOps Lead
Experience: 8–12 Years
Type: Full-Time
Work Mode: On-site
About the Role
We are looking for a hands-on Infrastructure & DevOps Lead who can own the full delivery lifecycle — from designing robust CI/CD pipelines to deploying and operating mission-critical systems at client locations, including offline DMZ environments and cloud infrastructure. This is a lead role: you will drive technical decisions, mentor a team of engineers, and be the go-to person on the ground when it matters most.
If you thrive in complex, restricted-network environments, take pride in zero-downtime deployments, and can bridge the gap between development teams and production infrastructure — this role is for you.
Key Responsibilities
DevOps & CI/CD Leadership
• Design, own, and continuously improve CI/CD pipelines for multi-tech stacks (Java/Spring Boot, .NET Core, Python)
• Implement GitOps practices and pipeline automation across dev, staging, and production environments
• Manage artifact repositories, container image registries, and deployment toolchains — including offline/air-gapped variants
• Drive shift-left security practices: integrate SAST, DAST, and dependency scanning into pipelines
• Define branching strategies, release workflows, and deployment standards across teams
Client-Site Deployments & Offline Environments
• Travel to and operate at client locations to execute deployments in air-gapped, DMZ, and restricted-network environments
• Set up self-hosted infrastructure equivalents: internal package mirrors, container registries, NTP, DNS, and CA within offline environments
• Execute controlled deployment windows with rollback plans and minimal downtime
• Produce and maintain deployment runbooks, SOPs, and handover documentation for client teams
Team Leadership & Coordination
• Lead and mentor a team of infrastructure and DevOps engineers
• Assign work, conduct technical reviews, and enforce engineering standards
• Coordinate closely with development, DBA, QA, and security teams during release cycles
• Act as the primary technical escalation point for infrastructure and deployment issues
Cloud & Hybrid Infrastructure
• Deploy and manage workloads across cloud platforms (AWS / Azure / GCP) and on-premises environments
• Architect hybrid setups where cloud and offline environments coexist with secure interconnects
• Manage cloud networking, IAM, and security group configurations aligned to client requirements
Linux Platform & Server Operations
• Administer RHEL 9/10 and equivalent SELinux-enabled environments: provisioning, hardening, user management, and capacity planning
• Configure systemd services and log rotation, and perform disk tuning
Clustered Software Deployments
Set up and operate clustered middleware and data platforms in offline environments, including:
- PostgreSQL — HA with Patroni or equivalent (replication, failover, quorum)
- Redis — standalone and cluster modes
- Kafka — KRaft / ZooKeeper-based broker setups
- MongoDB — replica sets and sharded clusters
- ELK Stack / OpenTelemetry — centralized logging and tracing
- Grafana / Zabbix — self-hosted monitoring and alerting
- Tomcat / Apache / Nginx / HAProxy — application hosting and load balancing
• Troubleshoot cluster-level issues: split-brain scenarios, leader elections, replication lag
• Understand quorum, failover concepts, and inter-node communication
Reverse Proxy, Web Security & Networking
• Configure Nginx and Apache as reverse proxies with SSL termination, upstream load balancing, and rate limiting
• Implement and tune ModSecurity WAF with OWASP CRS; manage rule exceptions and false-positive handling
• Enforce web security headers: CSP, HSTS, X-Frame-Options, and CORS policies
• Manage internal PKI, certificate lifecycle, VLANs, firewall rules, and segmented network architectures
High Availability & Disaster Recovery
• Design and implement HA architectures across all infrastructure tiers
• Own DR strategy: define RPO/RTO, conduct drills, validate restoration procedures
• Maintain DR runbooks and ensure team readiness for failover execution
Mandatory Skills Summary
| Category | Details |
|---|---|
| Experience | 8–12 years in infrastructure and DevOps roles |
| Offline / Air-Gapped | Proven experience in air-gapped, DMZ, and restricted-network environments — non-negotiable |
| CI/CD Ownership | Jenkins, GitLab CI, or equivalent — design and full ownership of pipelines |
| Linux Administration | RHEL 9/10 preferred — provisioning, hardening, systemd, capacity planning |
| Clustered Software | PostgreSQL (Patroni), Redis, Kafka, MongoDB, ELK — setup and operations |
| Reverse Proxy & WAF | Nginx / Apache, ModSecurity + OWASP CRS, SSL handling |
| Application Hosting | Java/Spring Boot, .NET Core, Python — as systemd services and Docker containers |
| HA & Disaster Recovery | Architecture design, RPO/RTO definition, DR drills, runbooks |
| Client-Site Deployment | Willingness to travel and work at client locations for deployments and go-lives |
| Documentation | Runbooks, SOPs, architecture diagrams — maintained and current |
Good to Have
• Kubernetes (K8s) — cluster setup, Helm charts, and workload management; air-gapped K8s (e.g., RKE2, K3s) is a strong plus
• Cloud platform experience — AWS, Azure, or GCP (infrastructure provisioning, IAM, VPC/VNet setup)
• Infrastructure as Code — Ansible, Terraform, or equivalent
• Experience with GitOps and
• Security certifications or exposure to compliance frameworks (ISO 27001, SOC 2, or equivalent)
