Senior Software Engineer, Data

Remote

About AssemblyAI

At AssemblyAI, we’re creating a leading Applied AI company by building powerful models to transcribe and understand audio data, exposed through a straightforward web API. 

Progress in AI is moving at an unprecedented pace. We keep our finger on the pulse of the latest developments and breakthroughs in AI research and use these advances to inform our production-ready AI models. Our Automatic Speech Recognition (ASR) models already outperform those of companies like Google, AWS, and Microsoft, which is why hundreds of companies and thousands of developers are using our API to transcribe and understand millions of videos, podcasts, phone calls, and Zoom meetings every day.

We’ve raised funding from leading investors including Accel, Insight Partners, Y Combinator’s AI Fund, Patrick and John Collison, Nat Friedman, and Daniel Gross. As part of a huge and emerging market, AssemblyAI is well on its way to becoming the leader in applied AI.

Join our world-class, remote team and help us build an iconic AI company!

About the role:

We’re looking for a Software Engineer to join our Data Infrastructure team. This person will have an opportunity to meaningfully contribute to the vision, scope, and structure of the team and to the architecture and capabilities it builds. This person should have a strong background in Data Engineering as well as experience as a Software Engineer who understands foundational best practices such as testing strategies and code reviews. They should be interested in directly leading projects that will have a material impact on the company’s ability to build and train models at scale.

This is a cross-functional role that requires close collaboration with both our Research team and our Data Operations team, so this person should have experience working with different stakeholders and presenting information clearly to different types of audiences. 

What You’ll Do:

  • Building and contributing to Data Platforms for our Research Team, for example managing Airflow, BigQuery, Dataproc, and Dataflow
  • Building highly scalable data pipelines on distributed computing platforms on GCP
  • Contributing to building our multimedia AI Lakehouse
  • Contributing to improving our Data Lineage System
  • Building internal tooling to help other teams to visualize, use, and understand large data sets
  • Building guardrails to optimize cost, data quality, usability, and speed

What You’ll Need:

  • 5+ years of software engineering experience in production settings writing clean, maintainable, and well-tested code
  • 3+ years of professional experience working as a Data Engineer or similar position
  • Experience with Bigtable, BigQuery, Dataproc, Dataflow, Dataplex, Cloud Composer, and other GCP services
  • Familiarity with distributed data processing frameworks such as Apache Beam and Apache Spark, and a deep understanding of both batch and stream processing
  • Experience with Airflow, either self-managed or through a managed offering such as Cloud Composer or Astronomer
  • Fluency in Python and SQL
  • Experience building internal applications and developer / researcher tools 
  • Experience building data lineage systems
  • Experience working with Terraform, Docker, Kubernetes, CI/CD
  • Knowledge of GCP IAM patterns and best practices 
  • Experience with Mage or Prefect is a plus

Pay Transparency:

AssemblyAI strives to recruit and retain exceptional talent from diverse backgrounds while ensuring pay equity for our team. Our salary ranges are based on paying competitively for our size, stage and industry, and are one part of many compensation, benefits and other reward opportunities we provide.

There are many factors that go into salary determinations, including relevant experience, skill level and qualifications assessed during the interview process, and maintaining internal equity with peers on the team. The range shared below is a general expectation for the function as posted, but we are also open to considering candidates who may be more or less experienced than outlined in the job description. In this case, we will communicate any updates in the expected salary range.

Lastly, the provided range is the expected salary for candidates in the U.S. For candidates outside the U.S., the range may differ, and any difference will be communicated to candidates.

Salary range: $180k - $240k

Working at AssemblyAI

We are a small but mighty group of problem solvers, innovators, and experienced AI researchers with over 20 years of expertise in Machine Learning, Speech Recognition, and NLP. As a fully remote team, we’re looking for people to join our team who are ambitious, curious, and self-motivated. We put a lot of trust and autonomy into everyone on our team and want to find people who will add to our culture, not just fit in.

We’re committed to creating a space where our employees can bring their full selves to work and have equal opportunity to succeed. So regardless of race, gender identity or expression, sexual orientation, religion, origin, ability, age, or veteran status, if joining this mission speaks to you, we encourage you to apply!

Keep Exploring AssemblyAI:

Check us out on YouTube!

Learn more about AI models for speech recognition

Core Transcription | Audio Intelligence | LeMUR | Try the Playground

Our $50M Series C fundraise

Job Profile

Skills

AI models, Airflow, Apache Beam, Apache Spark, API, BigQuery, CI/CD, Dataflow, Dataproc, Docker, Kubernetes, Python, SQL, Terraform

Tasks

  • Building Data Platforms
  • Building internal tooling
  • Building scalable data pipelines
  • Contributing to multimedia AI Lakehouse
  • Improving Data Lineage System
  • Optimizing cost, data quality, usability, and speed

Experience

5+ years

Restrictions

Fully remote