Chennai, Tamil Nadu
DevOps Specialist #1055775
Job Description:
- Employees in this job function are responsible for building, testing, and maintaining the infrastructure and tools for software/code development, deployment, and release.
- They also help the organization speed up and automate aspects of the software lifecycle through a set of DevOps practices and processes.
Key Responsibilities:
- Embed DevOps methodology and practices across the product lifecycle for various software components
- Accelerate software releases while ensuring security and mitigating the risk of failed/recalled software releases
- Shorten the cycle from end-user feedback to updated software by leveraging continuous integration, continuous deployment, and continuous feedback
- Assess current systems and processes, and identify and implement ways to improve them
- Utilize platform and enterprise tools and infrastructure to facilitate DevOps methodology and approach
- Assist with code/software releases and deployments in various environments such as production, testing, and sandbox
- Identify and spearhead creation of automation and innovation opportunities throughout the software product lifecycle
Skills Required:
- Cloud Functions, Python, BigQuery, GCP Cloud Run, GitHub, Terraform, API, CI/CD, GCP, Java, Tekton
Skills Preferred:
- React.js
Experience Required:
- Specialist Exp: 5+ years of experience in a relevant field
Experience Preferred:
- Core tech for LLMOps and MLOps: GCP, Terraform, Python, GitHub Actions, Astronomer, Dataflow, Cloud Run
- Cloud: Expert-level proficiency in any cloud platform; Google Cloud Platform (GCP) preferred.
- Languages: Expert Python (required); proficiency in Java (for data utility troubleshooting).
- Tools: Deep knowledge of GitHub Actions, Tekton, Cloud Build, and Astronomer.
- Security: Hands-on experience with FOSSA, SonarQube, 42Crunch, and Cycode.
- GCP Services: Cloud Run, Cloud Run Functions, Cloud Functions, Artifact Registry, IAM, Secret Manager, VPC, BigQuery, and Dataflow.
- Data/Search: Elasticsearch (ILM and Indexing) and BigQuery optimization.
Education Required:
- Bachelor's Degree
Additional Information:
Key Responsibilities
- Infrastructure & Orchestration (IaC & CI/CD)
- Architect and maintain multi-environment infrastructure using Terraform.
- Design and manage complex CI/CD pipelines using GitHub Actions, Tekton, and Cloud Build.
- Orchestrate data pipelines using Astronomer (Airflow) to ensure reliable execution of Java and Python-based utilities.
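The CI/CD pipeline design described above can be illustrated, independently of GitHub Actions, Tekton, or Cloud Build, as a dependency graph of stages executed in topological order. This is a minimal sketch; the stage names are hypothetical, not taken from this role:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
stages = {
    "lint": set(),
    "unit-test": {"lint"},
    "build-image": {"unit-test"},
    "scan-image": {"build-image"},
    "deploy-staging": {"scan-image"},
    "deploy-prod": {"deploy-staging"},
}

def run_order(graph):
    """Return a valid execution order for the pipeline stages."""
    return list(TopologicalSorter(graph).static_order())

order = run_order(stages)
print(order)  # lint first, deploy-prod last
```

Real orchestrators add parallelism, retries, and environment gating on top of this ordering idea.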
- Data & Search Operations (DataOps)
- Support Dataflow (Apache Beam) ETL jobs, focusing on autoscaling, vCPU quota management, and the implementation of Dead Letter Queues (DLQ) for failed records.
- Manage Elasticsearch indices, including the implementation of Index Lifecycle Management (ILM) to balance performance and storage costs.
- Optimize BigQuery performance through partitioning, clustering, and slot management strategies.
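The Dead Letter Queue pattern mentioned above can be sketched outside Beam as follows; in a real Dataflow job the failed records would be routed to a side output (e.g. a Pub/Sub topic or BigQuery table) rather than an in-memory list. All names here are illustrative:

```python
import json

def process(record: str) -> dict:
    """Parse one raw record; raises on malformed input."""
    return json.loads(record)

def run_with_dlq(records):
    """Route records that fail processing to a dead-letter list
    instead of failing the whole batch."""
    ok, dlq = [], []
    for raw in records:
        try:
            ok.append(process(raw))
        except Exception as exc:
            dlq.append({"record": raw, "error": str(exc)})
    return ok, dlq

ok, dlq = run_with_dlq(['{"id": 1}', 'not-json', '{"id": 2}'])
```

The key property is that one bad record is captured with its error context while the rest of the batch completes.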
- Advanced Security & Governance
- Integrate and manage "Shift Left" security tools: FOSSA, SonarQube, 42Crunch, and Cycode.
- Implement Binary Authorization to ensure only verified, scanned images from Artifact Registry are deployed.
- Enforce strict IAM roles, Secret Management, and VPC security (Service Controls) to protect sensitive data.
- Observability & Reliability (SRE)
- Develop comprehensive monitoring dashboards and alerting in GCP Cloud Monitoring/Logging.
- Define and track SLIs/SLOs for all five data flows to ensure system health.
- Establish Disaster Recovery (DR) and backup protocols for BigQuery datasets and Elasticsearch snapshots.
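The SLI/SLO tracking described above reduces to simple ratios. A minimal sketch, with made-up example numbers rather than targets from this role:

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: fraction of requests that succeeded."""
    return good_events / total_events if total_events else 1.0

def error_budget_remaining(sli: float, slo: float) -> float:
    """Fraction of the error budget still unspent.
    Budget = 1 - SLO; spent so far = 1 - SLI."""
    budget = 1.0 - slo
    spent = 1.0 - sli
    return max(0.0, 1.0 - spent / budget) if budget else 0.0

sli = availability_sli(good_events=9_990, total_events=10_000)  # 0.999
remaining = error_budget_remaining(sli, slo=0.995)
```

With a 99.5% SLO and a measured 99.9% SLI, 80% of the error budget remains; alerting on budget burn rate is typically built on exactly this calculation.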
- Cost & API Management
- Implement FinOps practices to monitor and reduce cloud spend across Dataflow, BigQuery, and Cloud Run.
- Manage outbound API connections and credentials securely, potentially utilizing GCP API Gateway or Apigee.
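As a toy sketch of the FinOps monitoring idea: aggregate spend per service and flag anything over budget. In practice the cost rows would come from a billing export (e.g. into BigQuery); the services and budget figures below are invented for illustration:

```python
daily_costs = [
    {"service": "Dataflow", "usd": 120.0},
    {"service": "BigQuery", "usd": 80.0},
    {"service": "Cloud Run", "usd": 15.0},
    {"service": "Dataflow", "usd": 95.0},
]
budgets = {"Dataflow": 200.0, "BigQuery": 100.0, "Cloud Run": 50.0}

def over_budget(costs, budgets):
    """Sum spend per service and return services exceeding their budget."""
    totals = {}
    for row in costs:
        totals[row["service"]] = totals.get(row["service"], 0.0) + row["usd"]
    return {svc: amt for svc, amt in totals.items()
            if amt > budgets.get(svc, float("inf"))}

alerts = over_budget(daily_costs, budgets)  # Dataflow exceeds its budget
```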