




Job Summary: Generative AI Data Lead responsible for managing a multidisciplinary team and providing technical direction for Generative AI data initiatives, with emphasis on quality and cost optimization.

Key Highlights:

1. Lead a multidisciplinary team (engineering, analytics, data science)
2. Define and maintain data architecture for Generative AI applications
3. Ensure quality, observability, and cost-efficiency in DataOps for AI

We are a **100% Brazilian** organization, passionate about building homegrown technology and focused on solutions and innovation. We deliver ICT (Information and Communication Technology) solutions across multiple sectors, including telecom, agribusiness, finance, utilities, industry, smart cities, retail, and defense & security services.

We believe that through our **Way of Being and Doing** we will achieve sustainable results, contributing to the development of our people and society.

Our values define our essence: who we are, what inspires us, and what makes us unique. They constitute our **Way of Being**.

**Institutional Values: COCRIAR (Portuguese for "co-create")**

* **C**ollaboration
* **O**rientation to customers
* **C**onfidence
* **R**espect
* **I**nnovation
* **A**daptability
* **R**esults

Based on our Way of Being, we define the competencies and behaviors that reflect, in practice, our **Way of Doing**: Leadership and Influence, Integration and Cooperation, Active Listening, Ethics and Integrity, Emotional Intelligence, Curiosity and Learning, Digital Thinking, Systems Thinking, Resilience and Flexibility, Planning and Time Management, Value Generation, and Data-Driven Decision Making.

Do you identify with our **Way of Being and Doing**? Then join us in **CO-CREATING** as part of our team as a **Generative AI Data Lead**!

**Your day-to-day challenges will include:**

* **People management and team operations**
  + Lead a multidisciplinary team (data engineering, analytics, and data science), including performance management, recurring feedback, 1:1 meetings, development plans, and career progression.
  + Plan and manage capacity, priorities, and deliverables (roadmap, backlog, SLAs/SLOs, risks, and dependencies).
  + Establish execution rituals and standards (quality, technical reviews, documentation, incident management, and continuous improvement).
  + Serve as the interface with stakeholders (Product, Engineering, Security, Legal/Compliance), aligning expectations and negotiating trade-offs among timeline, scope, cost, and risk.
  + Foster a culture of operational excellence, impact focus, and accountability for data and outcomes.
* **Technical leadership: data for Generative AI**
  + Define and maintain data architecture using the Medallion structure, with publication and consumption standards (data products) tailored for Generative AI applications.
  + Design and evolve pipelines for assembling LLM-specific datasets, including: source selection and curation, cleaning and normalization, deduplication, enrichment, standardization, anonymization where applicable, and metadata generation.
  + Implement dataset versioning and reproducibility (snapshots, lineage, dataset registry where applicable), ensuring traceability and auditability.
  + Define data contracts, taxonomies, and labeling/metadata standards to improve dataset discoverability, quality, and governance.
  + Ensure AI-specific data quality controls for consumption by LLMs and retrieval systems, including criteria for consistency, completeness, freshness, and source reliability.
* **Quality, observability, and cost (DataOps for AI)**
  + Establish end-to-end metrics and monitoring: data quality, latency, failures, costs, RAG index coverage, and indicators of impact on AI application performance.
  + Standardize incident response and post-mortems (RCA), including runbooks, alerts, and preventive processes.
  + Optimize cost and performance (storage, compute, orchestration, partitioning, incremental updates, and retention policies).

**We are looking for someone with:**

* Completed undergraduate degree.
* Solid experience in data engineering: ETL/ELT, modeling, orchestration, quality, and governance.
* Practical proficiency in Python and SQL.
* Proven experience with data architecture (data lake/lakehouse/data warehouse) and practical application of the Medallion framework (Bronze/Silver/Gold).
* Experience building ML/analytical datasets and understanding specific dataset requirements for LLMs (curation, versioning, metadata, and governance).
* Engineering practices: Git, code reviews, automated testing, and CI/CD (or equivalent).
* Ability to make architectural decisions considering trade-offs among cost, risk, latency, and quality.
* Proven experience leading/managing people (or technical leadership with formal team leadership responsibilities).
* Ability to structure backlogs and goals, negotiate priorities, resolve conflicts, and develop people.
* Strong communication skills to align technical and non-technical teams, and guide data- and evidence-based decisions.
* Availability for hybrid work in Campinas/SP.

**Nice-to-have qualifications:**

* Master’s and/or PhD.
* Experience with Data-Centric AI (curating and improving data as the primary lever for quality).
* Production experience with RAG (indexing, updating, re-ranking, evaluation, and observability).
* Knowledge of MLOps (reproducibility, traceability, monitoring, model lifecycle management).
* Experience with orchestration/transformation/streaming tools and modern stacks (e.g., Airflow/Dagster, dbt, Spark, Kafka or equivalents).
* Experience with cloud platforms and containers (Docker; AWS/GCP/Azure and/or on-premises environments), security, and secret management.

**CPQD offers:**

* Daycare or babysitter allowance;
* Unimed health insurance;
* Uniodonto dental insurance;
* Vidalink medication allowance;
* Meal/food voucher;
* Life insurance;
* Pet health plan;
* Vacation bonus above market standards;
* Private pension plan;
* WellHub;
* Commuter shuttle;
* Free parking;
* Payroll-deducted loans;
* Birthday day off;
* Holiday extensions;
* Citizen Company program: extended maternity and paternity leave;
* Postgraduate support program;
* Partnerships and agreements: discounted options for health, leisure, culture, and education;
* Internal programs: focused on development and well-being.

We take pride in who we are and believe this stems from our **diversity**! That is why we invest in our Diversity, Equity & Inclusion program, **CPQD+**, which strengthens our practices to ensure equal opportunity regardless of race, gender, age, religion, disability, or sexual orientation, and reaffirms our commitment to harnessing differences to foster an innovative, sustainable, and safe environment where everyone can reach their full potential.

Join our team and Connect to the New.

\#ConecteseAoNovo \#Diversidade \#Tecnologia \#CPQD


