




Job Summary: We are seeking a Data Engineer to map, document, and integrate data sources, designing and operating data pipelines with emphasis on security, performance, and compliance.

Key Highlights:

1. Working on mapping and documentation of internal and external data sources
2. Responsible for designing, building, and operating data pipelines
3. Focus on security, performance, reliability, and compliance (LGPD)

**Job Description:**

We are looking for a **Data Engineer** to perform **data source mapping and documentation** in collaboration with internal teams and **external vendors** (e.g., **SGA, PROSIS, ServiceNow**), defining **ingestion specifications**, **connector standardization**, and **ETL/ELT mechanisms**. You will be responsible for **designing, building, and operating pipelines**, focusing on **security, performance, reliability, and compliance** (governance/LGPD), ensuring **data availability** for consumption layers and the corporate data catalog.

**Responsibilities**

* Gather technical requirements from internal departments and partners (SGA, PROSIS, ServiceNow, etc.) to **map, classify, and document** data sources (structured, semi-structured, and APIs).
* Define **ingestion specifications** (batch/streaming), **connector standards**, and **data contracts** (schemas, SLAs/SLOs, versioning).
* Design and implement resilient and observable **ETL/ELT pipelines** (reprocessing, idempotency, alerts), with **data quality monitoring** (DQ checks) and **end-to-end lineage**.
* Optimize **performance and cost** (partitioning, clustering, compression, parallelism), applying **FinOps** where applicable.
* Publish datasets across **Bronze/Silver/Gold layers** and into the **corporate data catalog** (metadata, access policies, classification).
* Ensure **security and compliance**: IAM, encryption, masking, anonymization/pseudonymization, retention, and auditing, aligned with **LGPD** and governance policies.
* Operate and evolve **orchestration** (job execution/retries, SLAs), conduct **tuning/troubleshooting**, and support analytics/BI teams in data consumption.
* Collaborate with Product/Business teams on **business rule definition** and **handoff to consumption layers** (APIs, views, marts).

**Requirements**

* Proven experience in **data engineering**, including building and operating **ETL/ELT pipelines** (batch and/or streaming).
* Strong **SQL** and **Python** skills; experience with **Spark** and/or **dbt** is a plus.
* Hands-on experience with **cloud data platforms** (preferably **GCP**: **BigQuery, Dataflow/Dataproc, Pub/Sub, Cloud Storage, Cloud Composer, Dataplex**; or equivalent AWS/Azure services).
* Knowledge of **data catalog and lineage tools** (e.g., **Dataplex/Data Catalog/Atlas**), **data quality**, and **analytics modeling** (Medallion architecture, marts).
* Experience with **orchestration** (Airflow/Composer), **Git and CI/CD**, and **observability** (logs, metrics, alerts).
* Familiarity with **security and privacy** (IAM, encryption, LGPD), **API-based integration**, and connectors (REST, JDBC/ODBC).
* Ability to produce technical documentation and **communicate effectively** with business units and vendors.
* **Bachelor's degree** in IT, Engineering, Computer Science, Information Systems, or related fields; certifications (e.g., **GCP Data Engineer**, **dbt**, **ITIL/DAMA**) are advantageous.

### **Employment Type:** CLT

### **Benefits:** Meal Voucher, Transportation Voucher, Culture Voucher, Health Insurance, Life Insurance

### **Department:** Corporate


