EngineerJobs.io
Arcadia.io

Lead Analytics Engineer - Data Modeling & Quality

Remote | $160k - $185k/yr | Full time | Posted 5d ago

Job Description

The Lead Analytics Engineer focuses on data modeling and data quality, owning the DBT/SQL layer that transforms clinical and claims data into trusted datasets. The role combines strong data quality ownership with cross-functional collaboration within Arcadia's healthcare analytics platform.

Responsibilities

  • Author, review, and maintain DBT models built on Spark/Hudi, spanning ingest through bronze and silver layers.
  • Clarify data model concepts for clients, communicating assumptions and limitations and backing them with deliberate validation.
  • Troubleshoot issues, implement fixes, and create DBT tests to proactively prevent problems.
  • Optimize SQL performance for slow-running pipelines and queries.
  • Collaborate with Data Engineering on Hudi table design, partition strategies, and incremental processing patterns.
  • Triage data quality alerts, distinguishing source-level problems from transform-layer failures.
  • Design and maintain volume monitors and data quality monitors (null rates, distribution checks, future-date validations).
  • Author and enforce clinical DQ rules (entity volume, field coverage, LOINC coverage, referential integrity) and claims validation rules across silver and gold layers.
  • Lead quality reviews for connector promotions, evaluating silver entity coverage, validation pass rates, and bronze-to-silver transformation correctness.
  • Own the ticket queue for data quality, attribution, hierarchy, and customer-specific data quality issues, delivering clear, customer-facing findings.
  • Guide data quality reviews during connector installation and promotion (UAT/PRD), including claims validation playbooks and null analyses.
  • Partner with Data Engineering on root-cause analysis for errors, ingestion anomalies, and silver table issues surfaced by monitoring.
  • Coordinate with the Measure Implementation Team when data quality affects quality measure scores.
  • Contribute to and enforce data modeling standards across teams.
  • Work across the team's stack: data modeling with DBT-Spark, SQL, and Claude; warehousing with Redshift, Hudi, and AWS Athena; orchestration via Argo Workflows and Airflow; observability with Grafana and Loki; issue tracking in Jira.
  • Maintain robust source control through Git and GitHub with PR-based workflows.
  • Work with healthcare data domains including claims (plan, professional, pharmacy), EHR clinical entities, and MPI.

Requirements

  • Bachelor's or Master's degree in Computer Science, Statistics, Business, Economics, or a related field.
  • Advanced SQL skills with window functions, complex CTEs, multi-step aggregations, and performance tuning on columnar databases.
  • Hands-on DBT experience: authoring models, tests, macros, and YAML documentation; familiarity with incremental strategies.
  • Healthcare data literacy covering claims data (professional, institutional, pharmacy), clinical data (EHR entities), and quality dimensions (member months, coverage rates, null patterns).
  • Data quality mindset with the ability to separate source data issues from transform problems, design systematic validation checks, and communicate findings clearly.
  • Clear communicator capable of translating technical insights for clients and non-technical stakeholders.
  • Strong analytical judgment and the ability to identify anomalies in data distributions.
  • Ability to manage multiple projects concurrently, leveraging AI tooling for organization and efficiency.
  • Genuine interest in learning and applying AI tools to improve operations and processes.

Preferred qualifications

  • Experience with Spark SQL and the Hudi table format.
  • Familiarity with data quality monitoring tools.
  • Comfort working in an AI-first environment using Claude to build and verify workflows.
  • Exposure to population health analytics concepts such as HEDIS measures, risk adjustment, and value-based care metrics.
  • Python scripting for data investigation and automation.
  • Experience with Argo Workflows or similar orchestration platforms.
  • Healthcare data standards familiarity: ICD-10, CPT, NDC, LOINC, NPI.

Technologies

  • DBT-Spark
  • SQL
  • Claude
  • Amazon Redshift
  • Apache Hudi
  • AWS Athena
  • Argo Workflows
  • Airflow
  • Git
  • GitHub
  • Grafana
  • Loki
  • Jira

Benefits

  • Collaborate with a talented team tackling complex healthcare data challenges.
  • Flexible, fully remote work environment with strong support resources.
  • Exposure to senior leadership and strategic initiatives.
  • Be at the forefront of AI adoption, leveraging cutting-edge tools to accelerate work and shape team processes.
  • Help improve data quality and reliability that inform patient care decisions.
  • Join a mission-driven organization transforming the healthcare industry.
  • Become part of a diverse, energized Arcadian community aligned with purpose.
