Data Engineer with Airflow
Remote job
Hello, let’s meet!
We are Xebia - a place where experts grow. For nearly two decades now, we've been developing digital solutions for clients from many industries and places across the globe. Among the brands we’ve worked with are UPS, McLaren, Aviva, Deloitte, and many, many more.
We're passionate about Cloud-based solutions. So much so that we have a partnership with three of the largest Cloud providers in the business – Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). We even became the first AWS Premier Consulting Partner in Poland.
Formerly we were known as PGS Software. In 2021, we joined Xebia Group – a family of interlinked companies driven by the desire to make a difference in the world of technology.
Xebia stands for innovation, talented team members, and technological excellence. Xebia means worldwide recognition and thought leadership. This regularly provides us with the opportunity to work on global, innovative projects.
Our mission can be captured in one word: Authority. We want to be recognized as the authority in our field of expertise.
What makes us stand out? It's the little details, like our attitude, dedication to knowledge, and the belief in people's potential - emphasizing every team member's development. Obviously, these things are not easy to present on paper – so make sure to visit us to see it with your own eyes!
Now, we've talked a lot about ourselves – but we'd love to hear more about you.
Send us your resume to start the conversation and join #Xebia.
About the role:
As a Data Engineer at Xebia, you will work closely with engineering, product, and data teams to deliver scalable and robust data solutions to our clients. Your key responsibilities will include designing, building, and maintaining data platforms and pipelines, as well as mentoring new engineers.
You will be:
developing and maintaining data pipelines to ensure seamless data flows (see the minimal Airflow sketch after this list),
ensuring data integrity, consistency, and availability across all data systems,
integrating data from various sources, including transactional databases, third-party APIs, and external data sources, into the data lake,
implementing ETL processes to transform and load data into the data warehouse for analytics and reporting,
working closely with cross-functional teams including Engineering, Business Analytics, Data Science, and Product Management to understand data requirements and deliver solutions,
collaborating with data engineers to ensure data engineering best practices are integrated into the development process,
optimizing data storage and retrieval to improve performance and scalability,
monitoring and troubleshooting data pipelines to ensure high reliability and efficiency,
implementing and enforcing data governance policies to ensure data security, privacy, and compliance,
developing documentation and standards for data processes and procedures.
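To give a flavour of the pipeline work described above, here is a minimal sketch of an Airflow DAG, assuming Airflow 2.x (TaskFlow API) and Python 3.9+. The DAG name, task names, data, and transform logic are hypothetical placeholders, not details of any Xebia project:

    # Minimal ETL sketch using the Airflow 2.x TaskFlow API.
    # All names and data below are illustrative placeholders.
    from datetime import datetime

    from airflow.decorators import dag, task


    @dag(
        schedule="@daily",              # run once per day
        start_date=datetime(2024, 1, 1),
        catchup=False,                  # skip backfilling historical runs
        tags=["example"],
    )
    def example_etl():
        @task
        def extract() -> list[dict]:
            # Placeholder: in practice this would pull from a transactional
            # database or a third-party API.
            return [{"order_id": 1, "amount": 42.0}]

        @task
        def transform(rows: list[dict]) -> list[dict]:
            # Placeholder transformation step.
            return [{**r, "amount_cents": int(r["amount"] * 100)} for r in rows]

        @task
        def load(rows: list[dict]) -> None:
            # Placeholder: in practice this would write to the data warehouse.
            print(f"Loaded {len(rows)} rows")

        load(transform(extract()))


    example_etl()

The TaskFlow API passes data between tasks via XComs; on real projects, the extract and load steps would typically use provider hooks for the specific databases and APIs involved.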
Requirements
Your profile:
3+ years in a data engineering role, with hands-on experience in building data processing pipelines,
proficiency with Python,
proficiency with SQL (large joins, window functions; a small window-function example follows this list),
extensive experience with Apache Airflow, including DAG creation, triggers, and workflow optimization,
knowledge of data partitioning, batch configuration, and performance tuning for terabyte-scale processing,
hands-on experience with modern data libraries and frameworks (e.g., Databricks, Snowflake, Spark),
hands-on experience with ETL tools and processes,
deep understanding of relational and NoSQL databases, data modelling, and data warehousing concepts,
excellent command of oral and written English,
available to start within a short time frame (a maximum of one month's notice),
Bachelor's or Master’s degree in Computer Science, Information Systems, or a related field.
Working from the European Union region and holding a valid work permit are required.
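As a small, self-contained illustration of the SQL skills listed above, here is a window-function query runnable with Python's built-in sqlite3 module (window functions require SQLite 3.25+). The table and column names are invented for the example:

    # Running total of order value per customer, computed with a window function.
    # The schema and data are illustrative placeholders.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (customer_id INTEGER, placed_at TEXT, amount REAL);
        INSERT INTO orders VALUES
            (1, '2024-01-01', 10.0),
            (1, '2024-01-02', 25.0),
            (2, '2024-01-01', 99.0);
    """)

    rows = conn.execute("""
        SELECT customer_id,
               placed_at,
               amount,
               SUM(amount) OVER (
                   PARTITION BY customer_id
                   ORDER BY placed_at
               ) AS running_total
        FROM orders
        ORDER BY customer_id, placed_at
    """).fetchall()

    for row in rows:
        print(row)  # e.g. (1, '2024-01-01', 10.0, 10.0)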
Nice to have:
Terraform,
GitHub Actions,
AWS/Azure/GCP.
Recruitment Process:
CV review – HR call – Technical Interview (with live-coding elements) – Client Interview (live-coding) – Decision