Senior Data Engineer, Capacity Systems & Analytics

Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA

CoreWeave

Published 1 month ago

CoreWeave is the AI Hyperscaler™, delivering a cloud platform of cutting-edge services powering the next wave of AI. Our technology provides enterprises and leading AI labs with the most performant, efficient, and resilient solutions for accelerated computing. Since 2017, CoreWeave has operated a growing footprint of data centers covering every region of the US and across Europe. CoreWeave was ranked as one of the TIME100 most influential companies of 2024.

As the leader in the industry, we thrive in an environment where adaptability and resilience are key. Our culture offers career-defining opportunities for those who excel amid change and challenge. If you’re someone who thrives in a dynamic environment, enjoys solving complex problems, and is eager to make a significant impact, CoreWeave is the place for you. Join us, and be part of a team solving some of the most exciting challenges in the industry.

CoreWeave powers the creation and delivery of the intelligence that drives innovation.

About the Role:

We’re seeking a skilled Senior Data Engineer to lead the development of foundational data models that empower our Business Intelligence Engineers, analysts, and data scientists to efficiently work with and gain insights from our capacity and supply chain data. This role will own the creation and maintenance of star and snowflake schemas within our lakehouse environment and set the standards for dimensional modeling best practices. The engineer will also create and optimize key datasets and metrics essential to tracking business health.

Responsibilities:

Develop and maintain data models, including star and snowflake schemas, to support analytical needs across the organization.
Establish and enforce best practices for dimensional modeling in our lakehouse.
Engineer and optimize data storage using analytical table/file formats (e.g., Iceberg, Parquet, Avro, ORC).
Partner with BI, analytics, and data science teams to design datasets that accurately reflect business metrics.
Tune and optimize data in MPP databases such as StarRocks, Snowflake, BigQuery, or Redshift.
Collaborate on data workflows using Airflow, building and managing pipelines that power our analytical infrastructure.
Ensure efficient processing of large datasets through distributed computing frameworks like Spark or Flink.

Qualifications:

You thrive in a fast-paced, complex work environment and love tackling hard problems.
Hands-on experience applying Kimball Dimensional Data Modeling principles to large datasets.
Expertise in working with analytical table/file formats, including Iceberg, Parquet, Avro, and …