FreshRemote.Work

Sr. Data Engineer

Remote - United States

About BridgeBio

BridgeBio is a biopharmaceutical company founded to discover, create, test, and deliver transformative medicines to treat patients who suffer from genetic diseases and cancers with clear genetic drivers. We bridge the gap between remarkable advancements in genetic science in academic institutions and the delivery of meaningful medicines to patients. Founded in 2015, the company has built a portfolio of 20+ drug development programs ranging from preclinical to late-stage development in multiple therapeutic areas including genetic dermatology, precision oncology, cardiology, endocrinology, neurology, pulmonology, and renal disease, with two approved drugs.

Our focus on scientific excellence and rapid execution aims to translate today’s discoveries into tomorrow’s medicines. We have U.S. offices in San Francisco, Palo Alto, and Raleigh, with small satellites in other parts of the country. We also have international offices in Montreal, Canada, and Zurich, Switzerland, and are expanding across Europe.

To learn more about our story and company culture, visit us at https://bridgebio.com

Who You Are

The Data Science and Operations team is seeking a full-time Senior Data Engineer to build and maintain critical data and compute infrastructure to support our drug discovery and development efforts. This role will play a key part in building scalable data pipelines, optimizing cloud-based storage and processing, and ensuring data accessibility for data science and machine learning applications. 

As part of the Computational Genomics group at BridgeBio, the Data Science and Operations team is dedicated to three key objectives:   

  1. Discovering new opportunities for drug development through the analysis of human genetic data. 
  2. Providing data science and bioinformatics support to core program affiliates.  
  3. Building a data platform to facilitate data-driven decision-making in internal drug development. 

To achieve these goals, the team designs, develops, maintains, and operates software tools and data processing systems, enabling the analysis of scientific and business data for insightful discoveries. 

Responsibilities

  • Architect & Develop Scalable Data Infrastructure: Design and implement robust, secure, and scalable data pipelines and infrastructure on AWS using EC2, S3, Athena, EKS, and other cloud-native services
  • Optimize Data Processing: Leverage Apache Spark and Databricks to process large-scale datasets efficiently for analytics, reporting, and machine learning applications
  • Automate Data Workflows: Build and maintain orchestration workflows using Apache Airflow to automate data pipelines
  • Monitor & Optimize Performance: Continuously improve system reliability, performance, and cost efficiency through monitoring, logging, and infrastructure optimization
  • Collaborate with Cross-Functional Teams: Work closely with computational biologists, experimental scientists, and colleagues in business development to provide accessible and high-quality data solutions

No matter your role at BridgeBio, successful team members are:

  • Patient Champions, who put patients first and uphold strict ethical standards
  • Entrepreneurial Operators, who drive toward practical solutions and have an ownership mindset
  • Truth Seekers, who are detailed, rational, and humble problem solvers
  • Individuals Who Inspire Excellence in themselves and those around them
  • High-Quality Executors, who execute against goals and milestones with quality, precision, and speed

Education, Experience & Skills Requirements

  • Minimum Education Requirement
    • Master’s degree or higher in Computer Science, Data Engineering, Information Systems, or a related technical field
  • Relevant Experience
    • 5+ years of experience as a Data Engineer, DevOps Engineer, or in a similar role
  • Skills
    • Expert knowledge of AWS cloud computing, including hands-on experience with the following services:
      • EC2, S3
      • Athena
      • Elastic Kubernetes Service (EKS)
      • Elastic Container Registry (ECR)
    • Strong proficiency in Python, which includes:
      • Developing stand-alone libraries
      • Developing and deploying automated ETL pipelines
      • Knowledge of at least one testing suite
      • Performance optimization
    • Hands-on experience with at least one modern data platform:
      • Databricks
      • Snowflake
    • Expertise in Apache Spark:
      • Spark SQL, DataFrames, and PySpark
    • Knowledge of relational databases
    • Version control with Git, which includes:
      • Setting up and managing remote repositories, implementing proper branch management, and resolving merge conflicts locally
      • Collaborating on remote repositories: working with protected branches, submitting and resolving pull requests, and adding automated tests with GitHub Actions
  • Any experience with the following is a plus:
    • Human genetics data
    • The drug development process and the pharmaceutical industry
    • Data visualization and dashboarding solutions such as Metabase or Plotly Dash
    • The UK Biobank or the All of Us Research Program

What We Offer

  • Patient Days, where we are fortunate to hear directly from individuals living with the conditions we are seeking to impact throughout the year and learn how we can improve our efforts
  • A culture inspired by our values: put patients first, think independently, be radically transparent, every minute counts, and let the science speak
  • An unyielding commitment to always putting patients first
  • A decentralized model that enables our program teams to focus on advancing science and helping patients. Our affiliate structure is designed to eliminate bureaucracy and put decision-making power in the hands of those closest to the science
  • A place where you own the vision – both for your program and your own career path
  • A collaborative, fast-paced, data-driven environment where we inspire ourselves and each other to always perform at the top of our game
  • Access to learning and development resources to help you get in the best professional shape of your life
  • Robust and market-competitive compensation & benefits package (base pay, performance bonus, equity, and health, welfare & retirement programs)
  • Flexible PTO
  • Rapid career advancement for strong performers
  • The potential to work on multiple BridgeBio Pharma programs across multiple therapeutic areas over time
  • Partnerships with leading institutions
  • Commitment to Diversity, Equity & Inclusion

At BridgeBio, we strive to provide a market-competitive total rewards package, including base pay, an annual performance bonus, company equity, and generous health benefits. Below is the anticipated salary range for candidates for this role who will work in California. The final salary offered to a successful candidate will depend on several factors that may include, but are not limited to, the type and length of experience within the job, the type and length of experience within the industry, educational background, location of residence, and performance during the interview process. BridgeBio is a multi-state employer, and this salary range may not reflect positions based in other states.

Salary: $170,000 to $215,000 USD

Job Profile

Regions

North America

Countries

United States

Tasks

  • Automate data workflows
  • Build and maintain data infrastructure
  • Collaborate with teams
  • Monitor system performance
  • Optimize data processing

Skills

Apache Airflow, Apache Spark, Athena, AWS, Bioinformatics, Databricks, Data engineering, Data processing, Data Science, EC2, EKS, Genomics, Machine Learning, Pharmaceutical, S3, Software tools, Version Control, Visualization

Experience

5 years

Education

DO, Master's degree, Ph.D.

Timezones

America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu, UTC-10, UTC-5, UTC-6, UTC-7, UTC-8, UTC-9