NLP Data Scientist
Remote job
Hello, let’s meet!
We are Xebia - a place where experts grow. For nearly two decades now, we've been developing digital solutions for clients from many industries and places across the globe. Among the brands we’ve worked with are UPS, McLaren, Aviva, Deloitte, and many, many more.
We're passionate about Cloud-based solutions. So much so that we have partnerships with three of the largest Cloud providers in the business – Amazon Web Services (AWS), Microsoft Azure & Google Cloud Platform (GCP). We even became the first AWS Premier Consulting Partner in Poland.
Formerly we were known as PGS Software. In 2021, we joined Xebia Group – a family of interlinked companies driven by the desire to make a difference in the world of technology.
Xebia stands for innovation, talented team members, and technological excellence. Xebia means worldwide recognition and thought leadership, which regularly gives us the opportunity to work on global, innovative projects.
Our mission can be captured in one word: Authority. We want to be recognized as the authority in our field of expertise.
What makes us stand out? It's the little details, like our attitude, dedication to knowledge, and belief in people's potential, with an emphasis on every team member's development. Obviously, these things are not easy to present on paper – so make sure to visit us to see it with your own eyes!
Now, we've talked a lot about ourselves – but we'd love to hear more about you.
Send us your resume to start the conversation and join #Xebia.
You will be:
- working with data scientists and analysts to create and deploy new product features on the e-commerce website, in-store portals, and clients’ mobile apps,
- establishing scalable, efficient, automated processes for data analysis, model development, validation, and implementation,
- writing efficient and scalable software to ship products in an iterative, continual-release environment,
- contributing to and promoting good software engineering practices across the team and building cloud-native software for ML pipelines,
- contributing to and reusing community best practices,
- working on Generative AI solutions to augment the CDD (Customer Due Diligence) review process,
- developing a risk summarization for the end of a CDD review.
Requirements
Your profile:
- 8+ years of experience as a data engineer or software developer,
- 4+ years of experience developing and deploying machine learning systems into production,
- domain expertise relevant to retail banking, wholesale banking, tech, and COO domains (e.g. financial crime and contact centers), and to building analytics platforms and data products,
- experience in prompt engineering, Agentic AI, RAG, information retrieval, and LLMs,
- expertise in evaluation, NLU, LLM inference tuning, LLM fine-tuning,
- knowledge of LLMs, RAG, prompt engineering, and productionizing LLM applications,
- familiarity with MLOps architecture and practices,
- strong programming skills, particularly in Python,
- expertise in public cloud (preferably GCP),
- proficiency in managed GCP services (GKE, Cloud Storage, BigQuery, Dataproc, Dataflow, Cloud Run), the Vertex AI suite (Model Garden, Experiments, Pipelines, etc.), and CI/CD steps and tooling such as Cloud Build and Artifact Registry,
- knowledge of public Cloud Analytics,
- relevant experience in scikit-learn, MLflow, and TensorFlow,
- very good verbal and written communication skills in English.
Work from the European Union region and a work permit are required.
Nice to have:
- experience in Java, Scala, or Go,
- knowledge of AWS,
- experience in Kubernetes,
- expertise in Apache Airflow.
Recruitment Process:
CV review – HR call – Interview – Client Interview – Decision
Job Profile
Restrictions: Work from the European Union region required, work permit required
Benefits/Perks: Innovative projects, professional development, remote work
Tasks:
- Build cloud-native ML pipelines
- Develop and deploy product features
- Develop risk summarization
- Establish automated data analysis processes
- Promote software engineering practices
- Work on generative AI solutions
- Write scalable software
Experience: 8 years