
Backend Engineer - Ads Data Platform

San Francisco, CA

Reddit is a community of communities. It’s built on shared interests, passion, and trust and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 82M+ daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit redditinc.com.

Backend Engineer - Ads Data Platform team

As a Software Engineer focused on building data infrastructure, you will build and maintain the data infrastructure tools used by the entire Reddit Monetization Org to generate, ingest, and access petabytes of raw data, with a focus on performance and optimization. You will write scalable, fault-tolerant code while collaborating with a team of top-quality engineers, all while learning about and contributing to one of the most powerful streaming event pipelines in the world. You will also develop standards and frameworks that ensure a high level of data quality and help shape the future of the data platform at Reddit Ads. Not only will your work directly impact hundreds of millions of users around the world, but your output will also shape the data infrastructure across all of Reddit!

This is a backend software development position within the Ads Organization. Ads is the fuel that powers Reddit’s mission. As a backend engineer on the Ads Data Platform team, you would work on:

  • Building large-scale data infrastructure applications, such as setting up and maintaining data integration tools like Airflow or Spark, or hosting and maintaining a distributed data store like Apache Druid horizontally for the entire company.
  • Refining and maintaining our data infrastructure technologies to support privacy-safe storage and usage of data.
  • Owning the data pipelines that surface 65B+ daily events to all teams, and the tools we use for ingestion, storage, and improving data quality.
  • Designing and implementing tooling for access management, monitoring, and anomaly detection.
  • Owning data quality for crucial systems at Reddit, including defining and managing SLAs for datasets that support production services.
  • Performing code reviews that improve software engineering quality.

Technologies used on the team include:

  • Languages: Scala, Go, Python, Java
  • Frameworks: Spark, Thrift, Baseplate, Kafka, Flink, Airflow
  • Datastores: Postgres, Cassandra, Druid, Redis, BigQuery
  • Tools: Kubernetes, Argo, Docker

What We're Looking For:
