Principal Software Engineer, AI Model Serving

Boston

Job Summary:

Are you ready to join a game-changing open-source AI platform that harnesses the power of hybrid cloud to drive innovation?

The Red Hat OpenShift AI (RHOAI) team is looking for a Principal Software Engineer with Kubernetes and MLOps (machine learning operations) experience to join our rapidly growing engineering team. Our focus is to create a platform, partner ecosystem, and community through which enterprise customers can use AI to solve problems and accelerate business success. This is an opportunity to build and shape the next generation of hybrid cloud MLOps platforms, contribute to the development of the RHOAI product, participate in open-source communities, and be at the forefront of the evolution of AI. You’ll join an ecosystem that fosters continuous learning, career growth, and professional development.

In this role, you'll serve as a subject matter expert on the model serving and monitoring features of the open-source Open Data Hub project by actively participating in the KServe, TrustyAI, Kubeflow, Hugging Face, vLLM, and several other open-source communities. You will work as part of an evolving development team to rapidly design, secure, build, test, and release model serving, trustworthy AI, and model registry capabilities. This is primarily an individual contributor role: you will be a key contributor to upstream MLOps communities and collaborate closely with internal cross-functional development teams.

What you will do:

  • Be an influencer and leader in MLOps-related open source communities to help build an active MLOps open source ecosystem for Open Data Hub and OpenShift AI

  • Act as an MLOps SME within Red Hat by supporting customer-facing discussions, presenting at technical conferences, and evangelizing OpenShift AI within internal communities of practice

  • Architect and design new features for open-source MLOps communities such as Kubeflow and KServe

  • Provide technical vision and leadership on critical and high-impact projects 

  • Mentor, influence, and coach a team of distributed engineers

  • Ensure non-functional requirements including security, resiliency, and maintainability are met

  • Write unit and integration tests and work with quality engineers to ensure product quality

  • Use CI/CD best practices to deliver solutions as productization efforts into RHOAI

  • Contribute to a culture of continuous improvement by sharing recommendations and technical knowledge with team members

  • Collaborate with product management, other engineering, and cross-functional teams to analyze and clarify business requirements

  • Communicate effectively to stakeholders and team members to ensure proper visibility of development efforts

  • Give thoughtful and prompt code reviews

  • Represent RHOAI in external engagements including industry events, customer meetings, and open-source communities

What you will bring:

  • Existing contributions to one or more MLOps open source projects such as Kubeflow, KServe, Ray Serve, and vLLM

  • Recent hands-on experience in deploying and maintaining machine learning models in production environments

  • Passion for writing and maintaining reliable code

  • Experience with monitoring and alerting tools such as Prometheus and Grafana

  • Excellent written and verbal communication skills; fluent English language skills

  • Advanced experience developing applications in Python and Go

  • Advanced experience with Kubernetes or OpenShift

  • Ability to quickly learn and guide others on using new tools and technologies

  • Experience with source code management tools such as Git

  • Proven ability to innovate and a passion for staying at the forefront of technology

  • Excellent system understanding and troubleshooting capabilities

  • Autonomous work ethic, thriving in a dynamic, fast-paced environment

  • Technical leadership acumen in a global team environment

The following will be considered a plus:

  • Bachelor's degree in statistics, mathematics, computer science, operations research, or a related quantitative field, or equivalent expertise; Master’s or PhD is a big plus

  • Understanding of how Open Source and Free Software communities work

  • Experience with development for public cloud services (AWS, GCE, Azure)

  • Experience in engineering, consulting, or another field related to model serving and monitoring, model registry, explainable AI, or deep neural networks, either in a customer environment or supporting a data science team

  • Extensive experience with OpenShift

  • Familiarity with popular Python machine learning libraries such as PyTorch, TensorFlow, and Hugging Face

The salary range for this position is $163,420.00 - $269,640.00. Actual offer will be based on your qualifications.

Pay Transparency

Red Hat determines compensation based on several factors including but not limited to job location, experience, applicable skills and training, external market value, and internal pay equity. Annual salary is one component of Red Hat’s compensation package. This position may also be eligible for bonus, commission, and/or equity. For positions with Remote-US locations, the actual salary range for the position may differ based on location but will be commensurate with job duties and relevant work experience. 

About Red Hat

Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.

Benefits
  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

Note: These benefits apply only to full-time, permanent associates at Red Hat located in the United States.

Diversity, Equity & Inclusion at Red Hat
Red Hat’s culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from diverse backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions of diversity that compose our global village.

Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.


Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.

Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application-assistance@redhat.com. General inquiries, such as those regarding the status of a job application, will not receive a reply. 

Job Profile

Regions

North America

Countries

United States

Restrictions

Located in the United States

Timezones

America/Anchorage America/Chicago America/Denver America/Los_Angeles America/New_York Pacific/Honolulu UTC-10 UTC-5 UTC-6 UTC-7 UTC-8 UTC-9