




Job Summary:
Design, develop, and implement efficient, scalable, and secure data engineering solutions, collaborating closely with cross-functional teams.

Key Highlights:
1. Solid experience in AWS and Kubernetes environments.
2. Proficiency in Python, Spark (PySpark), Delta Tables, and web crawling.
3. Hands-on experience with Apache Airflow, Kafka, and Terraform (IaC).

Requirements:
* Postgraduate degree in Data Engineering, Data Science, Computer Science, or a related field;
* Solid AWS experience, including EKS, Glue, Athena, IAM, Lambda, ECR, RDS, and DynamoDB;
* Advanced knowledge of CI/CD and DevOps practices;
* Proficiency in Kubernetes;
* Proven experience with Spark (PySpark) and Delta Tables;
* Demonstrated experience in web crawling;
* Proficiency in Python and its core data manipulation libraries;
* Hands-on experience with Apache Airflow for pipeline orchestration;
* Practical knowledge of Kafka;
* Experience with Terraform (IaC);
* Experience with Azure DevOps;
* Experience with Databricks and Snowflake;
* Relevant certifications and/or courses in Data Engineering.

Responsibilities:
* Design, develop, and implement highly efficient, scalable, and secure data engineering solutions;
* Engage strategically and collaboratively with cross-functional teams to define requirements and guide data architecture;
* Implement and optimize CI/CD processes to ensure automation, versioning, and continuous delivery;
* Manage and optimize Kubernetes environments, making extensive use of AWS tools (e.g., EKS);
* Work extensively with AWS Developer Tools: CodeCommit, CodeBuild, CodeDeploy, and CodePipeline;
* Develop and optimize Spark (PySpark) jobs for large-scale data processing, ensuring performance and quality;
* Implement and maintain Delta Tables, ensuring data consistency and traceability;
* Build and manage web crawlers for web data extraction and transformation;
* Configure and maintain data pipelines using Apache Airflow, ensuring orchestration and process monitoring;
* Implement and administer Kafka environments for real-time data streaming;
* Manage infrastructure as code (IaC) using Terraform, enabling reproducible and auditable environments;
* Work with Databricks and Snowflake to optimize data storage, performance, and governance;
* Contribute to the continuous improvement of data engineering practices by proposing innovations and mentoring team members.


