Software Engineer - PySpark - Fully Remote
Posted 4 hours 33 minutes ago by Cloud Consulting
Cloud Consulting have an urgent requirement for an experienced PySpark Developer to work on a high-profile project for a leading company.
The role is fully remote and falls inside IR35.
You will require active CTC Clearance.
Key Duties and Responsibilities:
- Develop and optimise PySpark batch pipelines that process Parquet data and use Delta Lake for all I/O, applying validation, enrichment, and billing calculation logic directly in PySpark code.
- Build reliable PySpark jobs that read and write Delta tables on ADLS Gen2.
- Implement in-code validations (schema, null/range/value checks, referential lookups), routing rejected records to dedicated Delta quarantine tables.
- Design and implement billing logic: tariff/charge models, tiered pricing, pro-rata handling, VAT/discounts, adjustments, and full auditability.
- Externalise billing and validation rules via versioned JSON configs, ensuring deterministic, idempotent re-runs.
- Optimise Delta operations (MERGE, OPTIMIZE, Z-ORDER, VACUUM) and incremental/CDC merges into Azure SQL.
- Tune performance (partitioning, caching, broadcast joins) and maintain robust retries, checkpoints, and structured logging.
- Integrate with orchestrators (ADF or Container App Orchestrator) and CI/CD pipelines (GitHub Actions).
- Operate securely within private-network Azure environments (Managed Identity, RBAC, Private Endpoints).
Required Industry Knowledge and Competencies:
- PySpark with Delta Lake (structured APIs, MERGE, schema evolution).
- Solid knowledge of Azure Synapse Spark pools or Databricks, ADLS Gen2, and Azure SQL.
- Strong engineering discipline: observability, retries, cost and performance optimisation.
- Great Expectations (for supplementary data quality checks).
- Familiarity with ADF orchestration and containerised Spark workloads.
If you are interested, please forward a copy of your CV in the first instance.
Cloud Consulting