Data Engineer
Kaizen Global Technologies
Key Responsibilities
- Design, develop, and maintain scalable batch and real-time data ingestion frameworks using AWS services, Python, PySpark, and SQL.
- Build, orchestrate, and optimize end-to-end data pipelines using Amazon Managed Workflows for Apache Airflow (MWAA), Amazon Redshift, AWS Glue, and Amazon SageMaker Unified Studio.
- Develop low-latency streaming solutions using Amazon Kinesis, Apache Kafka, or similar technologies, ensuring high availability, performance, and reliability.
- Implement data transformation, validation, monitoring, governance, and observability solutions, securing and monitoring workloads with AWS IAM, Amazon VPC, and Amazon CloudWatch.
- Collaborate with cross-functional teams to design data lakehouse and data warehouse solutions, automate CI/CD deployments, and deliver scalable, reusable data engineering frameworks.
Skills Required
- Strong hands-on experience with AWS, Python, PySpark, SQL, Amazon Redshift, Apache Airflow (MWAA), and Amazon SageMaker Unified Studio.
- Expertise in building batch and real-time data ingestion pipelines using Amazon Kinesis, Apache Kafka, AWS Glue, APIs, databases, and file systems.
- Experience with relational and NoSQL databases such as Oracle, Teradata, Amazon RDS, DynamoDB, MongoDB, and Snowflake, including data modeling and query optimization.
- Solid understanding of data warehouse and data lakehouse architectures, ETL/ELT processes, workflow orchestration, monitoring, CI/CD, Git, CloudFormation, and Terraform.
- Knowledge of machine learning pipeline orchestration, SageMaker IDE (JupyterLab, Spaces), data governance, performance optimization, and cloud security best practices across AWS services (IAM, VPC, CloudWatch, S3, Lambda, API Gateway, SQS/SNS).
Please send your CV to ***email_hidden***