
Senior Software Engineer, Data

Remote

About AssemblyAI

At AssemblyAI, we’re creating a leading Applied AI company by building powerful models to transcribe and understand audio data, exposed through a straightforward web API. 

Progress in AI is moving at an unprecedented pace; we keep a close pulse on the latest developments and breakthroughs in AI research and use these advances to inform our production-ready AI models. Our Automatic Speech Recognition (ASR) models already outperform those of companies like Google, AWS, and Microsoft, which is why hundreds of companies and thousands of developers use our API to transcribe and understand millions of videos, podcasts, phone calls, and Zoom meetings every day.
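
As a rough illustration of what "a straightforward web API" means in practice, here is a minimal Python sketch of submitting an audio file for transcription and polling for the result. The base URL, endpoint paths, header names, and response fields here are assumptions for illustration, not AssemblyAI's documented API.

    import time
    import requests

    API_BASE = "https://api.example.com/v2"  # hypothetical base URL
    API_KEY = "your-api-key"                 # hypothetical credential

    def transcribe(audio_url: str) -> str:
        """Submit an audio URL for transcription and poll until done."""
        headers = {"authorization": API_KEY}

        # Kick off an asynchronous transcription job (endpoint assumed).
        job = requests.post(
            f"{API_BASE}/transcript",
            headers=headers,
            json={"audio_url": audio_url},
        ).json()

        # Poll the job until it completes (status values assumed).
        while True:
            result = requests.get(
                f"{API_BASE}/transcript/{job['id']}", headers=headers
            ).json()
            if result["status"] == "completed":
                return result["text"]
            if result["status"] == "error":
                raise RuntimeError(result.get("error", "transcription failed"))
            time.sleep(3)

    print(transcribe("https://example.com/meeting-recording.mp3"))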

We’ve raised funding from leading investors including Accel, Insight Partners, Y Combinator’s AI Fund, Patrick and John Collison, Nat Friedman, and Daniel Gross. In a huge and fast-emerging market, AssemblyAI is well on its way to becoming the leader in applied AI.

Join our world-class, remote team and help us build an iconic AI company!

About the role:

We’re looking for a Software Engineer to join our Data Infrastructure team. This person will have an opportunity to meaningfully contribute to the vision, scope, and structure of the team and to the architecture and capabilities it builds. They should have a strong background in Data Engineering, along with software engineering experience and a command of foundational best practices like testing strategies and code reviews. They should be interested in directly leading projects that will have a material impact on the company’s ability to build and train models at scale.

This is a cross-functional role that requires close collaboration with both our Research team and our Data Operations team, so this person should have experience working with different stakeholders and presenting information clearly to different types of audiences. 

What You’ll Do:

  • Building and contributing to data platforms for our Research team, e.g. managing Airflow, BigQuery, Dataproc, and Dataflow (see the Airflow sketch after this list)
  • Building highly scalable data pipelines on distributed computing platforms on GCP
  • Contributing to building our multimedia AI Lakehouse
  • Contributing to improving our Data Lineage System
  • Building internal tooling to help other teams to visualize, use, and understand large data sets
  • Building guardrails to optimize cost, data quality, usability, and speed
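
To make the first bullet concrete, here is a minimal sketch of the kind of Airflow DAG this team might own: a daily pipeline that extracts audio-file metadata and loads it into BigQuery. The DAG, task, and dataset names are hypothetical, and it assumes a recent Airflow 2.x install with the standard Python operator; the load step is a placeholder rather than a real BigQuery client call.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_metadata(**context):
        # Placeholder: pull new audio-file metadata from an upstream store.
        print("extracting metadata for", context["ds"])

    def load_to_bigquery(**context):
        # Placeholder: load the extracted batch into a BigQuery table
        # (table name is hypothetical).
        print("loading batch into research_lake.audio_metadata")

    with DAG(
        dag_id="audio_metadata_daily",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=extract_metadata)
        load = PythonOperator(task_id="load", python_callable=load_to_bigquery)

        extract >> load  # load runs only after extract succeeds

In a production version, the placeholders would give way to operators for the managed GCP services named above (Dataproc, Dataflow, BigQuery), with the DAG providing scheduling, retries, and lineage hooks.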

What You’ll Need:

  • 5+ years of software engineering experience in production settings writing clean, maintainable, and well-tested code
  • 3+ years of professional experience working as a Data Engineer or similar position
  • Experience with BigTable, BigQuery, Dataproc, Dataflow, Dataplex, and …

Job Profile

Restrictions

Fully remote

Skills

AI models, Airflow, Apache Beam, Apache Spark, API, BigQuery, CI/CD, Docker, Kubernetes, Python, SQL, Terraform

Experience

5+ years