Data Engineer

Nevada

The CDC Foundation helps the Centers for Disease Control and Prevention (CDC) save and improve lives by unleashing the power of collaboration between CDC, philanthropies, corporations, organizations and individuals to protect the health, safety and security of America and the world. The CDC Foundation is the go-to nonprofit authorized by Congress to mobilize philanthropic partners and private-sector resources to support CDC’s critical health protection mission. Since 1995, the CDC Foundation has raised over $1.9 billion and launched more than 1,300 programs impacting a variety of health threats from chronic disease conditions including cardiovascular disease and cancer, to infectious diseases like rotavirus and HIV, to emergency responses, including COVID-19 and Ebola. The CDC Foundation managed hundreds of programs in the United States and in more than 90 countries last year. Visit www.cdcfoundation.org for more information.  
Job Highlights

  • Location: Remote, must be based in the United States
  • Salary Range: $103,500-$143,500 per year, plus benefits. Individual salary offers will be based on experience and qualifications unique to each candidate.
  • Position Type: Grant funded, limited-term opportunity
  • Position End Date: June 30, 2025

Overview

The Data Engineer will play a crucial role in advancing the CDC Foundation's mission by designing, building, and maintaining data infrastructure for a public health organization. This role is aligned to the Workforce Acceleration Initiative (WAI). WAI is a federally funded CDC Foundation program with the goal of helping the nation’s public health agencies by providing them with the technology and data experts they need to accelerate their information system improvements. Working within the Nevada Division of Public and Behavioral Health (DPBH), Office of State Epidemiology Informatics team, the Data Engineer will deliver the architecture needed for data generation, storage, processing, and analysis in support of the AWS Data Lake. The Data Engineer will collaborate with epidemiologists, data content experts, analysts, data scientists, and DPBH Office of Information Technology staff to identify, design and implement proposed solutions and architectures that meet the needs of the public health agency. The Data Engineer will be hired by the CDC Foundation and assigned to the Informatics team within the DPBH Office of State Epidemiology. This position is eligible for a fully remote work arrangement for U.S.-based candidates.

Responsibilities

  • Design and implement distributed data processing pipelines supporting the Nevada and federal public health ecosystems.
  • Collect data from various sources, transforming and cleaning it to ensure accuracy and consistency. Load data into storage systems or data warehouses.
  • Optimize data pipelines, infrastructure, and workflows for performance and scalability.
  • Monitor data pipelines and systems for performance issues, errors, and anomalies, and implement solutions to address them.
  • Collaborate with Nevada’s Department of Health and Human Services and the Data Lake Governance Committee on data governance policy, data security, and data use agreements for impacted data systems required to meet the Office of State Epidemiology (OSE) mission.
  • Collaborate with the DPBH Office of Information Technology to implement best practices and standards for systems interoperability, including HL7 v2.x and v3.x, FHIR, RESTful State APIs, public health industry data taxonomies, database connection APIs (ODBC, JDBC, ADO, etc.), and vendor-supported information systems standards.
  • Lead and partner with architecture and engineering leads, both internal and from multiple solution vendors, and with other integrated project team members to ensure high-quality solutions through code reviews and software engineering best practices documentation.
  • Collaborate with Data Owners, Data Stewards, Information Technologists, Public Health Analysts, and other public health resources to identify, move, transform and curate public health data in support of big data analytics and data product creation by Data Scientists, Epidemiologists, and Biostatisticians.
  • Design, develop, modernize, and migrate pipelines to the AWS Data Lake.
  • Engage and collaborate with DevOps teams, both internal and from multiple solution vendors, by building utilities, user-defined functions, and frameworks to better enable data flow patterns.
  • Implement and maintain ETL processes to ensure the accuracy, completeness, consistency and security of data.
  • Design and manage data storage systems, including relational databases, NoSQL databases, and data warehouses.
  • Collaborate on data modeling, forecasting, and visualization projects.
  • Collaborate with AWS Data Lake Project Director on data governance policies.
  • Ensure compliance with Nevada security regulations and standards.
  • Establish data use agreements for relevant data systems.
  • Develop strategies and protocols to monitor data quality.
  • Implement mechanisms to identify and rectify data errors promptly.
  • Establish processes for ongoing data quality assessment and improvement.
  • Stay knowledgeable about industry trends, best practices, and emerging technologies in data engineering, and incorporate them into the organization's data infrastructure.
  • Collaborate with stakeholders to develop and participate in a ‘lessons learned’ and shared knowledge cohort.
  • Prepare documentation and share improvements, success stories, challenges, and opportunities at both State and CDC meetings and events.
  • Attend training and learning opportunities addressing technical skillsets and other related information systems improvement subjects.
  • Communicate effectively with partners at all levels of the organization to gather requirements, provide updates, and present findings.

Qualifications

Required Qualifications:
  • Bachelor's degree in Computer Science, Information Technology, Data Science, or a related field.
  • Minimum 5 years of relevant professional experience.
  • Proficiency in programming languages commonly used in data engineering, such as Python, R, Java, Scala, or SQL. Candidates should be able to implement data automations within existing frameworks rather than writing one-off scripts.
  • Experience with big data technologies and frameworks like Hadoop, Spark, Kafka, and Flink.
  • Strong understanding of database systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
  • Experience with engineering best practices such as source control, automated testing, continuous integration and deployment, and peer review.
  • Knowledge of data warehousing concepts and tools.
  • Experience with cloud computing platforms.
  • Expertise in data modeling, ETL (Extract, Transform, Load) processes, and data integration techniques.
  • Understanding of the underlying data models and structures used by Epic or Cerner EHR systems, including the relational database schemas, table relationships, and data dictionaries.
  • Knowledge of common data elements, terminology standards, and coding systems (e.g., ICD-10, SNOMED CT) used to represent clinical concepts and patient information within EHR systems. 
  • Experience in mapping and transforming EHR data into standardized formats for integration with other healthcare systems and analytics platforms. 
  • Knowledge of the ONC United States Core Data for Interoperability (USCDI).
  • Deep understanding of HIPAA, FERPA, and HITECH.  
  • Familiarity with agile development methodologies, software design patterns, and best practices.
  • Strong analytical thinking and problem-solving abilities.
  • Excellent verbal and written communication skills, including the ability to convey technical concepts to non-technical partners effectively.
  • Flexibility to adapt to evolving project requirements and priorities.
  • Outstanding interpersonal and teamwork skills; and the ability to develop productive working relationships with colleagues and partners.
  • Experience working in a virtual environment with remote partners and teams.
  • Proficiency in Microsoft Office products.

Preferred Qualifications:
  • Master’s degree in Computer Science, Health Informatics, Mathematics or related discipline.
  • 3-5 years of experience in large-scale software development/Big Data technologies. 
  • 3-5 years of experience in data engineering with an emphasis on data identification, data streams, data curation, data transformation, and delivery of data products. 
  • 3-5 years of experience with cloud platforms and public health data solutions: Amazon Web Services (AWS), EHR, Immunization, Vital Records, Health Surveillance, etc.   
  • 3-5 years of experience in SQL, data transformations, statistical analysis, and troubleshooting across more than one database platform (Caché, MySQL, Snowflake, PostgreSQL, Redshift, MS SQL, etc.).
  • 3-5 years of experience developing with HL7 and HL7 FHIR interoperability standards.
  • 3-5 years of experience in the design and build of data extraction, transformation, and loading (ETL) processes by writing custom data pipelines. 
  • 3-5 years of experience with one or more of the following scripting languages: Python, R, SQL, and/or others. 
  • 3-5 years of experience designing and building solutions utilizing AWS Cloud services such as Lambda, Glue, Athena, API Gateway, DynamoDB, etc.

Special Notes

This role is involved in a dynamic public health program. As such, roles and responsibilities are subject to change as situations evolve. Roles and responsibilities listed above may be expanded upon or updated to match priorities and needs, once written approval is received by the CDC Foundation in order to best support the public health programming.
All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of race, color, religion, sex, national origin, age, mental or physical disabilities, veteran status, and all other characteristics protected by law.
We comply with all applicable laws including E.O. 11246 and the Vietnam Era Readjustment Assistance Act of 1974 governing employment practices and do not discriminate on the basis of any unlawful criteria in accordance with 41 C.F.R. §§ 60-300.5(a)(12) and 60-741.5(a)(7). As a federal government contractor, we take affirmative action on behalf of protected veterans.
The CDC Foundation is a smoke-free environment. Relocation expenses are not included.