




Job Summary: We are seeking a Data Engineer to work on data projects in the AWS cloud, designing, developing, and maintaining scalable data pipelines.

Key Highlights:
1. Implementation of data projects on AWS cloud and Apache Spark
2. Development and maintenance of efficient data pipelines
3. Collaboration within multidisciplinary teams and proposal of solutions

Description: We are looking for a professional to work as a Data Engineer focused on AWS cloud data projects, using Apache Spark and Python. The candidate will be responsible for designing, developing, and maintaining efficient, scalable, and reliable data pipelines, with strong integration into cloud services and distributed ecosystems.

Responsibilities and Duties
* Design, implement, and maintain data pipelines (ETL/ELT) using Python, PySpark, and AWS services (S3, Glue, Athena, EMR, Lambda, Redshift).
* Work with batch and streaming processes in highly distributed environments.
* Model and structure data architectures, data lakes, and data warehouses.
* Optimize routines, queries, and data flows, ensuring scalability, security, and high availability.
* Collaborate with multidisciplinary teams to understand technical and business requirements, proposing innovative solutions.
* Implement best practices for data governance, quality, and security.

Requirements and Qualifications
* Proven experience with AWS (S3, Glue, EMR, Athena, Redshift, Lambda).
* Proficiency in Apache Spark (PySpark), including batch and streaming data processing.
* Strong knowledge of Python for pipeline development and integrations.
* Experience with relational and NoSQL databases (e.g., DynamoDB, MongoDB).
* Hands-on experience with Data Lake and Data Warehouse architecture.
* Skill in data modeling, analysis, and integration from diverse sources.
* Bachelor’s degree in a related field (Computer Science, Systems Engineering, Engineering, or similar).


