
DataOps Engineer

Indeed
Full-time
Onsite
No experience requirement
No degree requirement

Description

Job Summary: A professional to design, provision, and manage data infrastructure; build and optimize CI/CD pipelines for data and ML; and implement monitoring and data governance.

Key Highlights:
1. Working with Infrastructure as Code and automation.
2. Optimizing CI/CD pipelines for data and ML.
3. Implementing observability and data governance.

Requirements:
* Proficiency in Python, SQL, and shell scripting for automation.
* Solid experience with Infrastructure as Code (Terraform) for data environments.
* Practical experience building CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins) applied to data and ML.
* Deep knowledge of workflow orchestrators, especially Apache Airflow.
* Hands-on experience processing large-scale data with Apache Spark (batch and streaming).
* Familiarity with the modern data ecosystem, including tools such as dbt, Kafka, Flink, and Airbyte.
* Experience implementing monitoring and observability with tools such as Prometheus, Grafana, or the ELK Stack.

Differentiators (Advanced Knowledge):
* Experience across multiple cloud platforms (AWS, GCP, Azure) and Databricks.
* Knowledge of modern architectures such as Data Lakehouse and Data Mesh.
* Experience with data testing frameworks (e.g., Great Expectations, Soda).
* Experience with container orchestration using Kubernetes (and tools such as Kubeflow).
* Knowledge of analytical (OLAP) databases such as ClickHouse, Pinot, or Trino.
* Familiarity with governance tools such as Unity Catalog or OpenMetadata.

Responsibilities:
* Infrastructure as Code (IaC) and Automation: Design, provision, and manage our data infrastructure (AWS, GCP) using Terraform. Automate everything, from network and permission configurations to cluster auto-scaling.
* Orchestration and CI/CD for Data: Build and optimize CI/CD pipelines (GitHub Actions) to automate testing, schema validation, and secure deployment of data flows, transformations (dbt), and ML models.
* Robust Data Pipelines: Orchestrate complex data flows (batch and streaming) using tools such as Airflow, Kafka, and Flink, ensuring automatic failure recovery, proactive alerting, and efficient execution (see the Airflow sketch after this list).
* Observability and Reliability (Data SRE): Implement comprehensive monitoring with Prometheus, Grafana, or Datadog. Define and enforce SLAs, SLOs, and SLIs for data pipelines; lead incident response and root-cause analysis (see the metrics sketch after this list).
* Data Quality and Governance: Integrate automated data testing frameworks (Great Expectations, dbt tests) into the development lifecycle. Implement data governance with schema versioning, cataloging, and lineage tracking (see the data-quality sketch after this list).
* Performance and Cost Optimization (FinOps): Optimize the performance and cost of distributed data pipelines and workloads. Implement partitioning, caching, and parallelization strategies to ensure efficient use of cloud resources (see the Spark sketch after this list).
* Strategic Partnership: Act as a technical partner to the Data Engineering, ML Engineering, and Data Science teams, providing a stable platform and reliable data to accelerate innovation and value delivery.
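For the pipeline-orchestration responsibility, a minimal Airflow DAG sketch with retries and failure alerting. The DAG id, task names, and alert address (orders_daily, extract_orders, dbt_build, data-alerts@example.com) are illustrative placeholders, not part of this posting, and the schedule argument assumes Airflow 2.4 or newer.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # Placeholder extract step; a real task would pull from a source system.
    print(f"extracting orders for {context['ds']}")


default_args = {
    "owner": "dataops",
    "retries": 3,                          # automatic failure recovery
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,              # proactive alerting
    "email": ["data-alerts@example.com"],  # hypothetical alert address
}

with DAG(
    dag_id="orders_daily",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    transform = BashOperator(task_id="dbt_build", bash_command="dbt build --select orders")
    extract >> transform

Retries and email_on_failure cover the "automatic failure recovery" and "proactive alerting" points; in practice alerting often routes through a failure callback to Slack or PagerDuty instead of email.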
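For the observability item, a minimal sketch using the Python prometheus_client library: the job exposes a /metrics endpoint that Prometheus can scrape and Grafana can chart. Metric names and the port are illustrative. As a worked example of the SLO arithmetic, a 99.5% freshness SLO over a 30-day window (720 hours) leaves an error budget of roughly 0.005 x 720 = 3.6 hours.

import random
import time

from prometheus_client import Counter, Gauge, Histogram, start_http_server

ROWS_PROCESSED = Counter("pipeline_rows_processed_total", "Rows processed per pipeline", ["pipeline"])
RUN_DURATION = Histogram("pipeline_run_duration_seconds", "Pipeline run duration", ["pipeline"])
LAST_SUCCESS = Gauge("pipeline_last_success_timestamp", "Unix time of the last successful run", ["pipeline"])


def run_pipeline(name: str) -> None:
    # Record run duration; Grafana can chart rates and quantiles from these series.
    with RUN_DURATION.labels(pipeline=name).time():
        rows = 1000 + random.randint(0, 4000)  # stand-in for real work
        ROWS_PROCESSED.labels(pipeline=name).inc(rows)
    # Freshness SLI: alert when now() minus this timestamp exceeds the SLO threshold.
    LAST_SUCCESS.labels(pipeline=name).set_to_current_time()


if __name__ == "__main__":
    start_http_server(8000)  # serves /metrics for the Prometheus scraper
    while True:
        run_pipeline("orders_daily")
        time.sleep(60)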
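For the data-quality item, a minimal sketch assuming the legacy pandas-dataset API of Great Expectations (newer releases organise the same expectations behind a context/checkpoint API, and dbt tests express similar rules in YAML/SQL). The column names and rules are illustrative.

import pandas as pd
import great_expectations as ge

# Toy batch standing in for a real extract.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [10.0, 25.5, 7.9],
    "status": ["paid", "paid", "refunded"],
})

gdf = ge.from_pandas(df)
gdf.expect_column_values_to_not_be_null("order_id")
gdf.expect_column_values_to_be_between("amount", min_value=0)
gdf.expect_column_values_to_be_in_set("status", ["paid", "refunded", "cancelled"])

results = gdf.validate()
if not results.success:
    # Failing the run here keeps bad data out of downstream models.
    raise ValueError(f"data quality checks failed: {results}")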
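For the performance and cost (FinOps) item, a minimal PySpark sketch: prune input with a partition filter, cache an intermediate that is reused, and coalesce before writing to avoid many tiny files. The S3 paths and column names are illustrative placeholders.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_cost_optimization_sketch").getOrCreate()

# A partition filter on event_date lets Spark skip irrelevant files entirely.
orders = (
    spark.read.parquet("s3://example-lake/orders/")
    .where(F.col("event_date") == "2025-01-01")
)

# Cache the intermediate that feeds several downstream aggregations.
enriched = orders.withColumn("net_amount", F.col("amount") - F.col("discount")).cache()

daily_revenue = enriched.groupBy("event_date").agg(F.sum("net_amount").alias("revenue"))
orders_by_country = enriched.groupBy("country").agg(F.count("*").alias("order_count"))

# Coalesce before writing so each mart is a handful of files, not thousands.
daily_revenue.coalesce(1).write.mode("overwrite").parquet("s3://example-lake/marts/daily_revenue/")
orders_by_country.coalesce(8).write.mode("overwrite").parquet("s3://example-lake/marts/orders_by_country/")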

Source: Indeed
João Silva
Indeed · HR

Company

Indeed