Data Engineer

Kaizen Global Technologies

Key Responsibilities

  • Design, develop, and maintain scalable batch and real-time data ingestion frameworks using AWS services, Python, PySpark, and SQL.
  • Build, orchestrate, and optimize end-to-end data pipelines using Amazon Managed Workflows for Apache Airflow (MWAA), Amazon Redshift, AWS Glue, and Amazon SageMaker Unified Studio.
  • Develop low-latency streaming solutions using Amazon Kinesis, Apache Kafka, or similar technologies, ensuring high availability, performance, and reliability.
  • Implement data transformation, validation, monitoring, governance, and observability solutions while ensuring security through AWS IAM, VPC, and CloudWatch.
  • Collaborate with cross-functional teams to design data lakehouse and data warehouse solutions, automate CI/CD deployments, and deliver scalable, reusable data engineering frameworks.

Skills Required

  • Strong hands-on experience with AWS, Python, PySpark, SQL, Amazon Redshift, Amazon Managed Workflows for Apache Airflow (MWAA), and Amazon SageMaker Unified Studio.
  • Expertise in building batch and real-time data ingestion pipelines using Amazon Kinesis, Apache Kafka, AWS Glue, APIs, databases, and file systems.
  • Experience with relational and NoSQL databases such as Oracle, Teradata, Amazon RDS, DynamoDB, MongoDB, and Snowflake, including data modeling and query optimization.
  • Solid understanding of data warehouse and data lakehouse architectures, ETL/ELT processes, workflow orchestration, monitoring, CI/CD, Git, CloudFormation, and Terraform.
  • Knowledge of machine learning pipeline orchestration, SageMaker IDE (JupyterLab, Spaces), data governance, performance optimization, and cloud security best practices, along with supporting AWS services (IAM, VPC, CloudWatch, S3, Lambda, API Gateway, SQS/SNS).

Please drop your CV to ***email_hidden***