Essential
You will have:
• Experience working across distributed processing, traditional RDBMS, MPP and NoSQL database technologies.
• Strong background with ETL and data warehousing tools such as Informatica, Talend, Pentaho or DataStage.
• Hands‑on experience with Hadoop, Spark, Storm, Impala and related platforms.
• Strong understanding of RDBMS concepts, ETL principles and end‑to‑end data pipeline development.
• Solid knowledge of data modelling techniques (ERDs, star schema, snowflake schema).
• Experience with AWS services including S3, EC2, EMR, RDS, Redshift and Kinesis.
• Exposure to distributed processing (Spark, Hadoop, EMR), RDBMS (SQL Server, Oracle, MySQL, PostgreSQL), MPP (Redshift, Teradata) and NoSQL technologies (MongoDB, DynamoDB, Cassandra, Neo4j, Titan).
• Experience designing and building streaming pipelines using tools such as Kafka, Kafka Streams or Spark Streaming.
• Strong proficiency in Python, plus at least two of Scala, SQL and Java.
• Experience deploying production applications, including testing, packaging, monitoring and release management.
• Proficiency with Git‑based source control and CI/CD pipelines, ideally GitLab.
• Strong engineering discipline including code reviews, testing frameworks and maintainable coding practices.
• Master’s degree in Computer Science, MIS, Engineering or a related field.
• At least 10 years’ experience in Data Engineering or Data Architecture.
• Experience working within DevOps, Agile, Scrum or Continuous Delivery environments.
• Ability to mentor team members and support capability development across teams.
• Strong communication, listening and influencing skills.
• High levels of motivation, adaptability and problem‑solving capability.
Preferred
• Experience with structured, semi‑structured and unstructured data.
• Understanding of data governance, lineage and data quality approaches.
• Experience with Infrastructure‑as‑Code tools such as Terraform.
• Exposure to workflow orchestration tools like Azkaban, Luigi or Airflow.
• Experience enabling data consumption through APIs, event streams or data marts.
• Experience with MuleSoft, Solace or StreamSets.
Bonus
• Experience in the mining or resources sector.