Healthcare IT & AI/ML Integration Data Engineer
Remote/Teleworker — United States
We are seeking an experienced Healthcare IT & AI/ML Data Engineer to join our team in modernizing data collection, aggregation, and analysis for the Centers for Medicare & Medicaid Services (CMS). As an IT integration contractor supporting application development teams and business owners, we aim to enhance data transformation capabilities while leveraging AI/ML-driven solutions for intelligent data modernization.
This role requires a deep understanding of data engineering, AI/ML integration, cloud-based data solutions, and compliance with federal healthcare regulations. The ideal candidate will assess current implementations, identify areas for optimization, and propose strategic improvements that align with modern data engineering best practices and CMS objectives.
Key Responsibilities:
- Assessment & Strategy:
- Evaluate existing data pipelines, architectures, and transformation processes.
- Provide recommendations for optimizing and modernizing data systems to enhance efficiency, scalability, and cost-effectiveness.
- Define a data modernization strategy that incorporates AI/ML-driven automation and analytics.
- Data Engineering & Transformation:
- Design and implement scalable, cloud-native ETL/ELT pipelines to support real-time and batch data processing.
- Build and maintain data warehouses and data lakes for high-performance analytics and reporting.
- Improve data quality, lineage, and governance with automated validation and monitoring.
- Implement data versioning and reproducibility for better traceability.
- AI/ML Integration for Data Modernization:
- Develop and integrate AI/ML solutions to enhance data ingestion, transformation, and analytics capabilities.
- Work with ML frameworks (TensorFlow, PyTorch, MLflow) to enable automated decision-making and anomaly detection.
- Collaborate with Data Scientists to operationalize machine learning models and optimize predictive analytics.
- Utilize NLP and deep learning to improve healthcare data processing and extraction.
- Cloud & Infrastructure Modernization:
- Architect cloud-based data solutions leveraging AWS, Azure, or Google Cloud (e.g., AWS Glue, Redshift, S3, Azure Synapse).
- Implement serverless and event-driven architectures for dynamic data processing.
- Ensure high availability and security in alignment with CMS and federal compliance standards (FISMA, HIPAA).
- Data Governance & Compliance:
- Ensure adherence to CMS data policies, federal security regulations, and privacy and interoperability frameworks (HIPAA, FHIR, 21st Century Cures Act).
- Design data models with strong governance principles to improve auditability and regulatory reporting.
- Establish best practices for data cataloging and metadata management.
- Collaboration & Agile Practices:
- Work closely with business analysts, software engineers, data scientists, and DevSecOps teams to develop end-to-end solutions.
- Participate in Agile/Scrum development cycles, contributing to sprints and backlog grooming.
- Document technical processes, workflows, and data architecture for cross-team knowledge sharing.
Required Skills & Qualifications:
- Technical Expertise:
- Proficiency in SQL, Python, and Scala for data transformation and automation.
- Experience with big data processing frameworks (Apache Spark, Databricks, Hadoop).
- Hands-on experience with ETL/ELT orchestration tools (Apache Airflow, dbt, Informatica).
- Strong knowledge of cloud-based data platforms (AWS Glue, Redshift, Azure Data Factory, Google BigQuery).
- Familiarity with containerization and orchestration (Docker, Kubernetes).
- Experience with real-time data streaming (Kafka, Kinesis, Pub/Sub) for handling large-scale data ingestion.
- Solid understanding of DataOps methodologies, CI/CD pipelines, and Infrastructure-as-Code (Terraform, CloudFormation).
- AI/ML & Advanced Analytics:
- Strong understanding of ML engineering for integrating AI models into data pipelines.
- Experience with ML lifecycle management using MLflow, TensorFlow Extended (TFX), or Kubeflow.
- Proficiency in feature engineering and model deployment in cloud environments.
- Knowledge of NLP and predictive modeling techniques for healthcare applications.
- Core Consulting Skills:
- Excellent verbal and written communication skills to convey complex technical concepts to non-technical stakeholders.
- Ability to produce clear and concise documentation and reports.
- Strong problem-solving skills to identify issues and recommend effective solutions.
- Analytical thinking to assess data-related challenges and opportunities.
- Ability to research and learn new technologies/products, and to propose and build proofs of concept/technology demonstrators.
Preferred Skills/Certifications (Nice to Have):
- Healthcare Data & Compliance Knowledge:
- Experience working with healthcare data standards (FHIR, HL7, CCDA).
- Understanding of CMS regulatory frameworks and data security best practices.
- Cloud Certifications:
- AWS Certified Data Analytics – Specialty
- Google Cloud Professional Data Engineer
- Microsoft Certified: Azure Data Engineer Associate
- AI/ML Certifications:
- TensorFlow Developer Certification
- Databricks Certified Machine Learning Associate
- Healthcare Data & Security Certifications:
- Certified Healthcare Data Analyst (CHDA)
- Certified Information Systems Security Professional (CISSP)
Methodologies & Best Practices:
- DataOps & MLOps: Continuous integration and deployment of AI/ML models into data pipelines.
- Data Governance & Privacy: Ensuring data security, compliance, and access control for CMS projects.
- DevSecOps: Embedding security in data pipelines and cloud infrastructure.
Education & Experience:
- Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field. Additional years of experience may be substituted for a degree.
- 5+ years of experience in data engineering, cloud architecture, and AI/ML-driven data processing.
Must be able to obtain and maintain a public trust clearance.
All candidates supporting the CMS programs must have lived in the United States for at least three (3) of the last five (5) years to be considered.
Original Posting Date:
2025-02-14
While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days, with an anticipated close date no earlier than 3 days after the original posting date listed above.
Pay Range:
$85,150.00 - $153,925.00
The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other applicable law.