Senior Data Engineer
Raleigh, United States
About the Job
The Red Hat Products & Global Engineering Business Insights team is looking for a Principal Machine Learning Engineer to join us in Raleigh, NC or Boston, MA. In this role, you will serve in the Business Unit organization, which spans Red Hat's product portfolio. You'll work closely with a team of data scientists, data engineers, and stakeholders to operationalize machine learning and generative AI models related to product development and customer experience. Some of our current production-ready models include gradient boosting, transformer-based LLM, deep learning, and logistic regression models.
What You Will Do
Collaborate with team members to develop and operationalize data science solutions, including migrating prototypes (e.g. Jupyter Notebooks) into production environments
Accelerate data workflows and model training using distributed computing
Lead the end-to-end ML product lifecycle, from requirements gathering to validation, implementation, deployment, and post-deployment
Work closely with data engineering and MLOps functional roles to operationalize scalable data science solutions through automation and continuous delivery
Engage with stakeholders to improve ML systems design and application development
Establish CI/CD and unit testing practices for model development
Work closely with data scientists to determine appropriate model performance and data/concept drift monitoring
Lead and guide the team to implement MLOps best practices
What You Will Bring
Bachelor's degree in mathematics, statistics, computer science, or a related technical field, or equivalent work experience (master's degree a plus)
5+ years of experience developing production-grade machine learning models
2+ years of experience with Python, containers, Kubernetes, and CI/CD
Experience querying/transforming structured/unstructured data (SQL, PySpark, Scala)
Proven experience with a deep learning framework (PyTorch, TensorFlow, Keras)
Strong theoretical knowledge of, and applied experience with, statistics, machine learning algorithms (supervised and unsupervised), and large language models
Demonstrated ability to produce well-documented, well-formatted code using object-oriented design principles
Proficiency with CI/CD, containerized development, and version control (e.g. Docker, Jenkins)
Preferred Skills
Integration of ML tracking and orchestration tools (e.g. Airflow, MLflow, Kubeflow)
Experience with deploying large language models, specifically RAG solutions
Experience with cloud-native technologies, specifically Red Hat OpenShift Container Platform
Experience with distributed computing (e.g. Dask, Ray)
Ability to fine-tune deployed models to improve scalability, reliability, and performance
Pay Transparency
Red Hat determines compensation based on several factors including but not limited to job location, experience, applicable skills and training, external market value, and internal …
Job Profile
Located in the United States, Permanent associates
Benefits/Perks: Bonus, Collaboration, Commission, Comprehensive medical, Dental, Employee Assistance Program, Employee Stock Purchase, Employee stock purchase plan, Equity, Flexible Spending, Flexible Spending Account, Fully remote, Health savings account, Inclusive environment, Medical, Paid parental leave, Paid Time Off, Parental leave, Pay Transparency, Remote-first company, Retirement 401k, Retirement 401k with employer match, Tuition reimbursement, Vision, Vision coverage
Tasks
- Best Practices
- Drive innovation
- Engage with stakeholders
- Establish ci/cd practices
- Implement mlops best practices
- Monitor model performance
- Solve complex problems
Experience: 5 years
Education: AI, Bachelor's, Bachelor's degree, Business, Computer Science, Data Science, Degree, Engineering, Equivalent, Equivalent work experience, IT, Master's degree, Related technical field, Relevant Work Experience, Technical field
Timezones: America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu, UTC-10, UTC-5, UTC-6, UTC-7, UTC-8, UTC-9