




Job Summary: As a Senior Data Engineer, you will build and operate the data backbone of a high-impact digital health product where AI is central, ensuring data quality and governance.

Key Highlights:

1. Build and operate the data backbone for a digital health product.
2. Design robust data pipelines and ensure data quality.
3. Collaborate with engineering and AI teams.

**1. About Marisa.Care**

In **2025 we grew 11x**, operating across Brazil’s leading healthcare networks, including **Rede D’Or and Hospitales MAC**, and launching our **international expansion with a hospital network in Mexico**. We were accelerated by **Microsoft for Startups**, joined the **NVIDIA Inception** program, and recently closed a **R$ 8M funding round**, led by **Afya**.

**2. Job Description**

As a Senior Data Engineer, you will be responsible for building and operating the data backbone of a high-impact digital health product where AI is central to the solution. With a hands-on mindset and strong technical seniority, you will design robust data pipelines, ensure the quality and governance of data feeding intelligent models, and closely collaborate with engineering and AI teams to transform clinical and operational data into trustworthy, traceable assets, adhering to the highest security and compliance standards required in the healthcare sector.

* **Hands-on Responsibilities**
  * Design, build, and maintain robust data pipelines (ETL/ELT) that feed both classical ML models and production RAG and LLM systems;
  * Ensure data quality, governance, and traceability using best practices such as data contracts, dataset and artifact cataloging, and versioning;
  * Architect and operate MLOps platforms: feature stores, model registries, experiment tracking, model serving, and production performance monitoring;
  * Implement continuous retraining pipelines, drift detection, and objective criteria for model promotion and rollback based on business and technical metrics;
  * Integrate data and ML pipelines into CI/CD workflows, ensuring reproducibility and traceability of experiments;
  * Establish data observability: tracing, logs, data quality metrics, and alerts for pipeline and model degradation;
  * Actively collaborate with the AI team to build RAG pipelines: ingestion, chunking, indexing, embeddings, and hybrid search;
  * Ensure compliance with LGPD and applicable healthcare regulations, applying best practices for data masking, PII management, and security-by-design.
* **Requirements**
  * Proven experience as a Senior Data Engineer or equivalent role in high-scale digital products;
  * Proficiency in Python for data engineering and ML pipelines;
  * Experience with cloud platforms (Azure, AWS, or GCP) and pipeline orchestration tools (Airflow, Prefect, or equivalents);
  * Experience with MLOps platforms: MLflow, Databricks, SageMaker, or similar;
  * Solid knowledge of SQL, NoSQL, and vector databases, as well as messaging/event systems (Kafka, RabbitMQ, or equivalents);
  * Familiarity with LLMs in production and RAG systems, or a strong willingness to learn;
  * Knowledge of data security, PII masking, and compliance with LGPD and applicable healthcare regulations;
  * Ability to work autonomously, deliver high-quality outcomes, and clearly communicate technical decisions to both technical and non-technical audiences.
* **Nice-to-Have**
  * Experience with RAG pipeline tracing and evaluation (Langfuse, Ragas, DeepEval, or equivalents);
  * Knowledge of hybrid search techniques (BM25 + dense), re-rankers, and proprietary embeddings;
  * Prior experience in healthtech, fintech, or other highly regulated data sectors;
  * Open-source contributions or active participation in data and AI communities.
* **Position Details**
  * Contract type: PJ (individual contractor)
  * Work model: Hybrid, Belo Horizonte


