What we’re looking for:
- 5+ years of experience building complex data processing applications using Python (Pandas, PySpark, or Dask).
- Advanced SQL skills, including complex transformations, window functions, and query optimization in cloud data warehouses.
- Deep experience with dbt (data build tool) for managing the transformation (T) layer of ELT, including model documentation and testing.
- Proven experience with Apache Airflow, Prefect, or Dagster for managing complex dependency graphs.
- Hands-on experience with Snowflake, BigQuery, or Amazon Redshift.
- Strong understanding of dimensional modeling (star and snowflake schemas) and Data Vault 2.0.
- Experience with Git, Docker, and implementing CI/CD for data pipelines.
Nice to have:
- Experience building real-time streaming pipelines with Kafka or Flink.
- Familiarity with data contracts and data quality frameworks (Great Expectations, Monte Carlo).
- Knowledge of vector databases (Pinecone, Milvus) for AI/LLM applications.
- Infrastructure as Code (Terraform) experience.
Responsibilities:
- Build and maintain scalable, automated ELT/ETL pipelines that provide a “single source of truth” for the organization.
- Implement rigorous automated testing and monitoring to ensure data integrity and reliability.
- Optimize warehouse storage and compute costs while reducing pipeline latency.
- Partner with Data Scientists and Product Managers to translate business requirements into technical data models.
- Promote a “DataOps” culture within the team, conducting code reviews and sharing best practices.
Are you interested in this position?
Apply by clicking the “Apply Now” button below!
#AlbionarcJobs #FintechJobs #AsiaJobs #MiddleEastCareers #TechTalent #FintechRecruitment #FinanceOpportunities
