




Job Summary:

We are seeking a Data Engineer to join our team responsible for evolving an AWS-based Data Lakehouse platform, playing a pivotal role in data integration, pipeline orchestration, and the delivery of analytical layers.

Key Highlights:

1. Solid experience with AWS: S3, Glue Data Catalog, Airflow (MWAA), Redshift.
2. Proficiency in Python, advanced SQL, and Power BI (semantic modeling, DAX).
3. Knowledge of Data Lakehouse architecture and engineering best practices.

Description:

We are seeking a Data Engineer to join our team responsible for evolving an AWS-based Data Lakehouse platform. This person will play a pivotal role in integrating data from multiple sources, orchestrating reliable pipelines, and delivering high-performance analytical layers for consumption via Amazon Redshift and Power BI.

Technical Requirements:

* Solid experience with AWS: S3, Glue Data Catalog, Airflow (preferably MWAA), Python DAGs, and Amazon Redshift.
* Languages: Python and advanced SQL.
* Power BI: semantic modeling, DAX, incremental refresh, gateway, and best practices for consuming Redshift.
* Knowledge of Data Lakehouse architecture, columnar formats (Parquet), partitioning, and metadata.
* Engineering best practices: Git, testing, code reviews, documentation, and reliable pipelines.
* Security and governance: IAM, encryption, the principle of least privilege, and LGPD (Brazil's data protection law) applied to data.

Key Responsibilities:

* Design and implement data pipelines using Airflow/MWAA with Python and SQL, adhering to best practices for modularity, testing, and versioning (see the DAG sketch after this list).
* Model Bronze/Silver/Gold layers (Medallion architecture) in S3 + Glue Data Catalog, defining partitions, formats (Parquet/Delta*), and tables optimized for querying (see the S3/Glue sketch below).
* Build and optimize analytical models in Amazon Redshift, ensuring performance and cost efficiency (see the DDL sketch below).
* Publish and maintain reliable datasets for Power BI, including gateways, incremental refresh, aggregations, and the efficient use of DirectQuery/Import.
* Collaborate with analysts and business teams to translate requirements into consumable datasets, KPIs, and analytical layers, while documenting data catalogs and data contracts.
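As a concrete illustration of the pipeline work described above, here is a minimal sketch of an Airflow DAG of the kind MWAA orchestrates, using the TaskFlow API (Airflow 2.4+). The bucket, table, and task names are hypothetical placeholders, not part of this posting; a production DAG would add retries, alerting, and tests.

```python
# A minimal sketch of an Airflow DAG (TaskFlow API, Airflow 2.4+) of the kind
# MWAA orchestrates. Bucket, prefix, and task names are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    schedule="@daily",                # run once per day
    start_date=datetime(2024, 1, 1),
    catchup=False,                    # do not backfill past runs
    tags=["lakehouse", "bronze"],
)
def orders_bronze_pipeline():
    @task
    def extract() -> str:
        # Pull raw data from a source system and land it in S3 (stubbed here).
        return "s3://example-lake/bronze/orders/ingest_date=2024-01-01/"

    @task
    def validate(path: str) -> str:
        # Lightweight data-quality check before promoting the batch.
        assert path.startswith("s3://")
        return path

    @task
    def register(path: str) -> None:
        # Register/refresh the partition in the Glue Data Catalog (stubbed).
        print(f"registering partition at {path}")

    register(validate(extract()))


orders_bronze_pipeline()
```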
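The Medallion-layer responsibility often amounts to writing partitioned Parquet to S3 and registering it in the Glue Data Catalog. The sketch below uses awswrangler (the AWS SDK for pandas) as one common way to do this; the bucket, database, and table names are hypothetical.

```python
# A sketch of promoting a cleaned DataFrame into a Silver table: partitioned
# Parquet on S3, registered in the Glue Data Catalog via awswrangler.
import awswrangler as wr
import pandas as pd

df = pd.DataFrame(
    {
        "order_id": [1, 2],
        "amount": [10.0, 25.5],
        "ingest_date": ["2024-01-01", "2024-01-01"],
    }
)

wr.s3.to_parquet(
    df=df,
    path="s3://example-lake/silver/orders/",  # hypothetical bucket/prefix
    dataset=True,                             # write as a partitioned dataset
    database="silver",                        # hypothetical Glue database
    table="orders",
    partition_cols=["ingest_date"],           # partition for pruned scans
    mode="append",
)
```

Partitioning on an ingestion date lets downstream queries prune whole S3 prefixes instead of scanning the full table.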
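On the Redshift side, "optimizing analytical models" typically involves choosing distribution and sort keys deliberately. Below is a hedged sketch of a Gold-layer fact table DDL, executed here via the redshift_connector driver; the cluster endpoint, credentials, schema, and columns are all hypothetical.

```python
# A sketch of creating a query-optimized Gold table in Amazon Redshift with
# explicit distribution and sort keys. All identifiers are placeholders.
import redshift_connector

DDL = """
CREATE TABLE IF NOT EXISTS gold.fact_orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_date  DATE,
    amount      DECIMAL(12, 2)
)
DISTSTYLE KEY
DISTKEY (customer_id)   -- co-locate rows joined on customer_id
SORTKEY (order_date);   -- date-restricted scans prune by sort key
"""

conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="analytics",
    user="etl_user",
    password="example-password",  # in practice, fetch from Secrets Manager
)
try:
    cursor = conn.cursor()
    cursor.execute(DDL)
    conn.commit()
finally:
    conn.close()
```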


