




**Job Summary:** We are seeking a Data Engineer to work on AWS cloud projects, focusing on Apache Spark and Python, developing and maintaining scalable data pipelines.

**Key Highlights:**

1. Work on cloud-based AWS data projects
2. Develop efficient data pipelines using Apache Spark and Python
3. Collaborate with cross-functional teams

**Description:** We are looking for a Data Engineer to work on cloud-based AWS data projects using Apache Spark and Python. The candidate will be responsible for designing, developing, and maintaining efficient, scalable, and reliable data pipelines that integrate tightly with cloud services and distributed ecosystems.

**Responsibilities and Duties:**

- Design, implement, and maintain data pipelines (ETL/ELT) using Python, PySpark, and AWS services (S3, Glue, Athena, EMR, Lambda, Redshift); see the example pipeline sketch below.
- Work with batch and streaming processing in highly distributed environments.
- Model and structure data architectures, data lakes, and data warehouses.
- Optimize routines, queries, and data flows, ensuring scalability, security, and high availability.
- Collaborate with cross-functional teams to understand technical and business requirements, proposing innovative solutions.
- Implement best practices for data governance, quality, and security.

**Requirements and Qualifications:**

- Proven experience with AWS (S3, Glue, EMR, Athena, Redshift, Lambda).
- Proficiency in Apache Spark (PySpark), including batch and streaming data processing.
- Strong knowledge of Python for pipeline development and integrations.
- Experience with relational and NoSQL databases (e.g., DynamoDB, MongoDB).
- Hands-on experience with Data Lake and Data Warehouse architectures.
- Skills in data modeling, analysis, and integration from diverse sources.
- Bachelor's degree in a related field (Computer Science, Information Systems, Engineering, or similar).
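
To illustrate the pipeline pattern described in the responsibilities above, here is a minimal batch ETL sketch in PySpark. It is a non-authoritative example only: the bucket paths, column names, and job name are hypothetical assumptions, not details of this role.

```python
# Hypothetical batch ETL job: raw JSON events in S3 -> curated,
# partitioned Parquet in S3 (queryable via a Glue/Athena table).
# All bucket names and columns below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-daily-etl")  # hypothetical job name
    .getOrCreate()
)

# Extract: read raw order events from an S3 landing zone.
raw = spark.read.json("s3://example-landing-zone/orders/")

# Transform: keep completed orders and build a daily aggregate.
daily_totals = (
    raw.filter(F.col("status") == "completed")
       .withColumn("order_date", F.to_date("created_at"))
       .groupBy("order_date")
       .agg(
           F.sum("amount").alias("total_amount"),
           F.count("*").alias("order_count"),
       )
)

# Load: write partitioned Parquet to a curated zone; a Glue
# crawler or catalog table can then expose it to Athena.
(
    daily_totals.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-zone/orders_daily/")
)

spark.stop()
```

The same transformation logic could run on an EMR cluster or as an AWS Glue PySpark job; only the session setup and deployment packaging would differ.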


