Engineering

Data Engineer

Bengaluru
Work Type: Full Time
We are looking for a skilled Data Engineer to join our Data Platform team and help build scalable, reliable, and secure data infrastructure. The ideal candidate will be responsible for designing and developing distributed data pipelines, working with large-scale datasets, and enabling downstream analytics and product use cases through a modern lakehouse architecture.
You will work closely with backend, analytics, and platform teams to build high-performance ETL pipelines and optimize data workflows across batch and near real-time systems.

Key Responsibilities
 • Design, develop, and maintain scalable batch data pipelines using Apache Spark
 • Build and manage ETL workflows on Databricks or equivalent distributed compute platforms
 • Work with data lake storage formats such as Delta Lake / Apache Iceberg
 • Implement efficient data ingestion, transformation, and processing pipelines
 • Query and process large datasets using distributed SQL engines such as Trino
 • Work with columnar file formats such as Parquet for optimized storage and processing
 • Ensure data quality, consistency, and reliability across ingestion and transformation layers
 • Implement best practices for handling PII and sensitive data in compliance with security and governance standards
 • Optimize data processing jobs for performance and cost efficiency
 • Collaborate with cross-functional teams to understand data requirements and enable data-driven product features
 • Monitor, troubleshoot, and improve pipeline reliability and performance

Profile
 • 3-4 years of experience in Data Engineering or related roles
 • Strong hands-on experience with Apache Spark (PySpark / Scala / SQL)
 • Experience working with Databricks or similar distributed data platforms
 • Experience with Trino / Presto for distributed querying
 • Solid understanding of ETL/ELT concepts and data pipeline design
 • Experience working with data lake architectures
 • Hands-on experience with Delta Lake / Apache Iceberg
 • Strong experience working with Parquet or other columnar storage formats
 • Experience working with large-scale structured/semi-structured datasets
 • Familiarity with handling and processing PII / sensitive data securely
 • Understanding of data partitioning, schema evolution, and performance tuning

Good to Have
 • Experience with workflow orchestration tools (Airflow, Dagster, etc.)
 • Familiarity with AWS/GCP/Azure data ecosystem
 • Experience with streaming platforms and frameworks such as Apache Kafka
 • Knowledge of data governance and access control mechanisms
 • Exposure to lakehouse architecture patterns

Education
Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent practical experience)

