Senior Data Engineer

Posted 4 days 13 hours ago by Luxoft

Permanent
Full Time
Other
London, United Kingdom
Job Description
Overview

Project description

We're seeking a highly skilled and motivated Senior Data Engineer to join our growing data team. In this role, you'll architect and maintain robust, scalable data pipelines and infrastructure that power our analytics, machine learning, and business intelligence initiatives. You'll work with technologies such as Python, PySpark, AWS EMR, and Snowflake, and collaborate across teams to ensure data is clean, reliable, and actionable.

Responsibilities
  • Build and maintain scalable ETL pipelines using Python and PySpark to support data ingestion, transformation, and integration.
  • Develop and optimize distributed data workflows on AWS EMR for high-performance processing of large datasets.
  • Design, implement, and tune Snowflake data warehouses to support analytical workloads and reporting needs.
  • Partner with data scientists, analysts, and product teams to deliver reliable, well-documented datasets.
  • Ensure data integrity, consistency, and accuracy across multiple sources and systems.
  • Automate data workflows and processes to improve efficiency and reduce manual intervention.
  • Monitor pipeline performance, identify bottlenecks, and resolve issues proactively.
  • Apply best practices in CI/CD, version control (e.g., Git), and infrastructure-as-code (e.g., Terraform, CloudFormation).
  • Enforce data security, compliance, and governance standards in line with industry regulations.
  • Mentor junior engineers, conduct code reviews, and foster a culture of continuous learning and knowledge-sharing.

Skills

Must have

  • 5+ years of experience in data engineering or software development.
  • Strong proficiency in Python and PySpark.
  • Hands-on experience with AWS services, especially EMR, S3, Lambda, and Glue.
  • Deep understanding of Snowflake architecture and performance tuning.
  • Solid grasp of data modeling, warehousing concepts, and SQL optimization.
  • Familiarity with CI/CD tools (e.g., Jenkins, GitHub Actions) and infrastructure-as-code.
  • Experience with data governance frameworks and security best practices.
  • Excellent communication and collaboration skills.

Nice to have

N/A