DevOps & Platform Support Engineer
USA Remote
Credo AI is a venture-backed company on a mission to empower organizations to responsibly build, adopt, procure, and use AI at scale. Credo AI has built a pioneering platform for context-driven AI governance, AI risk assessment, and compliance (with regulations like the EU AI Act and standards like NIST AI RMF and ISO 42001) to ensure compliant, fair, and auditable development and use of AI. Our goal is to move responsible AI development from an “ethical” choice to an obvious one by ensuring AI’s benefits are universally accessible while addressing the full spectrum of its risks. We aim to do this both by making it easier for organizations to integrate responsible AI governance practices into their AI development and by collaborating with regulators and policymakers to set up appropriate ecosystem incentives. Founded in 2020, Credo AI has been recognized as one of the Most Innovative Companies of 2024 by Fast Company and as a Technology Pioneer by the World Economic Forum, named to the CBInsights' AI 100 List and World's Most Promising Startups list, and included in Fast Company’s Next Big Thing in Tech and in the Intelligent Applications Top 40 by Madrona, Goldman Sachs, Microsoft and Pitchbook.
About the Role
What we are looking for:
As a DevOps & Platform Support Engineer at Credo AI, you will play a crucial role in driving the success of our customer implementations. In this role, you will work with the Customer Success team to ensure smooth customer installations and implementations, providing technical support and project management assistance.
You will serve as a technical expert and trusted advisor, bridging the gap between our product capabilities and the customer's unique challenges. Your ability to communicate complex technical concepts clearly and concisely will be instrumental in conveying the value of our AI governance software to a diverse range of stakeholders. Through your work, you will drive the adoption of Responsible AI at scale, ensuring organizations can leverage the power of artificial intelligence while maintaining trust and compliance with applicable regulations and guidelines.
You might be a good fit if you have experience in the following areas:
Kubernetes and Container Orchestration
Proficiency in deploying applications to Kubernetes (K8s) clusters, including an understanding of Kubernetes Operators, Services, Pods, and other core K8s concepts.
Containerized Applications: Knowledge of container runtime environments (e.g., containerd) and container registries. Experience with Docker and understanding of containerization principles.
Networking in Kubernetes: Understanding of Kubernetes networking, ingress controllers (e.g., AWS ALB, nginx), and service meshes. Ability to manage and troubleshoot Kubernetes networking, including load balancing and DNS configurations.
Cloud Services Management: Experience with cloud services, specifically AWS, including S3, RDS for PostgreSQL, and IAM roles. Knowledge of cloud security best practices and implementation.
Security and Compliance
TLS/SSL Management: Ability to manage TLS private keys and certificates for securing applications.
Network Security: Proficiency in configuring firewall rules and ensuring secure access to the application. Experience with SMTP configurations and STARTTLS for email services.
Database and Storage
Database Administration: Experience with PostgreSQL, including setting up, managing, and optimizing databases. Knowledge of database security best practices, especially for cloud-based solutions like AWS RDS.
Object Storage: Understanding of AWS S3 and S3-compatible object storage, including bucket management and security best practices.
DevOps and Automation
Scripting and Automation: Proficiency in scripting languages (e.g., Bash, Python) for automation of installation and maintenance tasks.
Continuous Integration/Continuous Deployment (CI/CD): Experience with CI/CD pipelines and tools for automated deployment and updates.
System Administration
Operating Systems: Knowledge of Linux operating systems, system administration, and the ability to troubleshoot OS-level issues. Familiarity with SELinux and its configuration.
Network Configuration: Understanding of network configurations, DNS management, and IP addressing.
Identity and Access Management
Single Sign-On (SSO): Experience with OIDC providers for SSO integration. Understanding of identity protocols and secure authentication mechanisms.
Monitoring and Support
Application Monitoring: Knowledge of monitoring tools and techniques to ensure application performance and availability.
Troubleshooting: Ability to diagnose and resolve issues during installation, deployment, and operation phases. Effective problem-solving skills for both infrastructure and application-level issues.
Expectations:
Manage the complete deployment lifecycle: identify the customer’s deployment lead and team, create and execute the deployment project plan alongside the customer, and manage and support the installation and setup of the Credo AI Platform to ensure a smooth and successful launch
Manage and support post-deployment activities, including software releases, customizations, monitoring, and maintenance, ensuring the system remains optimal and up-to-date
Utilize strong communication skills to effectively convey deployment-related information and confidently lead customer discussions on deployment and releases
Support ongoing implementation, custom deployment work, configuration updates, and management of the Platform to meet business requirements
Troubleshoot issues and provide technical support to ensure high availability and reliability
Coordinate Platform updates and upgrades as well as the customer-facing deployment instructions and documentation
Design and implement a repeatable, standard process for new customer deployments that optimizes efficiency, quality, and self-sufficiency while adapting to the unique characteristics and needs of each customer
Compensation
The expected base salary range for this position is $150,000 - $170,000. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position in the specified location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
Location & Remote Culture
While this is a remote role and we're a fully distributed team, we routinely meet up in person. We support individual team members in coordinating in-person coworking whenever possible, and we organize company-wide offsites multiple times a year. At Credo AI we value diversity, equity, and inclusion as core principles in our work environment and in the development of our product offerings, and we have implemented initiatives to foster and support these values.
Credo AI Benefits & Perks
Competitive Salary and Equity
Health: We offer health, dental, and vision coverage. We also offer an ergonomic benefit to cover the costs of equipment to help staff stay healthy while working, both in the office and at home.
Coworking: We will cover the cost of coworking spaces like WeWork and in-person meetups.
Unlimited PTO: Credo AI has unlimited time off to support our employees.
Generous Parental Leave: We offer up to 12 weeks of paid parental leave.
401(k) plan for employees (US only)
Job Profile
Benefits: Company-wide offsites; Competitive salary and equity; Ergonomic benefit; Health, dental, and vision coverage
Tasks
- Communicating technical concepts
- Customer installations
- Driving AI adoption
- Project management assistance
- Technical support
Skills: AI Governance, AI Risk Assessment, Automation, AWS, CI/CD, Cloud Security, Communication, Compliance, Container Orchestration, Continuous Deployment, Continuous Integration, Database Administration, DevOps, Docker, Identity and Access Management, Ingress Controllers, Kubernetes, Linux, Network Configuration, Networking, Network Security, Object Storage, PostgreSQL, Project Management, Scripting, Security and Compliance, Security Best Practices, Service Meshes, System Administration, TLS/SSL Management
Timezones: America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu (UTC-10, UTC-9, UTC-8, UTC-7, UTC-6, UTC-5)