Data Scientist
HealthLeap
Location
San Francisco Office
Employment Type
Full time
Department
Engineering
HealthLeap is an AI start-up revolutionizing healthcare through predictive analytics, initially focused on disease-related malnutrition—a critical form of patient deterioration affecting virtually every hospital condition. Our mission is to maximize health outcomes globally by building a scalable AI platform that screens patients comprehensively using electronic health records (EHRs), labs, clinical notes, and more.
We're experiencing explosive growth, with contracts and pilots at top US health systems such as Cedars-Sinai, Intermountain Health, Penn Medicine, and many others. We have unprecedented access to EHR data (far more than competitors: tens of billions of tokens per hospital plus vast structured data across 100% of inpatients), positioning us to expand into additional conditions and outcomes such as pressure ulcers, congestive heart failure, infections, readmissions, and mortality. With a unique strategy and no direct competitors (due to our 2+ year lead and regulatory advantages), we are setting out to become a $100B+ company.
Outcomes You’ll Drive
Condition expansion velocity: Idea → signal and label viability using current EHR data → validated model → customer-ready, in weeks rather than months for viable use cases.
Improving patient health outcomes: Quantified reductions in length of stay (LOS), readmissions, and mortality, with clear confidence intervals and robust counterfactuals.
Pilot → production conversion: Run retrospective analyses on hospital data to prove impact, then transition validated pilots into live deployments that deliver measurable outcomes.
Role Overview
We’re looking to hire a product-minded Data Scientist with a sound theoretical foundation. You will own end-to-end problem framing, timeline scoping, experimental design, and model iteration. You'll work closely with our CEO and small team to launch new models quickly and safely by leveraging and expanding our existing feature tables. You will also run retrospective pilots to estimate clinical and financial impact (reimbursement lift, LOS reduction, mortality reduction). In addition, you will support pre-sales by meeting AI/Data Science leaders at world-class health systems to share your clinical and financial model assumptions and development methodologies.
Key Responsibilities
Own end-to-end modeling from financial incentives and problem framing to a validated model.
Estimate impact with rigorous retrospective analyses (LOS, readmissions, mortality, reimbursement).
Productionize pipelines and rollouts with reliability.
Monitor & improve: drift, calibration/uncertainty, and fairness (Independence/Separation/Sufficiency).
Translate research into pragmatic wins for our platform.
Partner with stakeholders: clear visuals, crisp narratives, and method presentation for analysts, clinicians, and executives.
Requirements
Passionate about AI's potential in healthcare; outcomes-oriented with a focus on impact, not just research.
Statistics: parametric and non-parametric tests, hypothesis testing, experimental design, confidence intervals, and causal inference basics.
ML fluency: Python, SQL; polars (or pandas), scikit-learn, XGBoost/LightGBM (PyTorch/transformers a plus); survival/time-to-event experience is great.
Visualization & storytelling: Expert at turning complex analyses into crisp user visualizations, dashboards, and narratives for clinicians and executives.
Customer-facing: Comfortable interviewing stakeholders, presenting to AI/data science leaders, and defending methods.
Research translation: Able to read the latest research and rapidly translate new statistical/ML papers into pragmatic wins.
3-5+ years of relevant experience in a high-growth environment.
BS/MS in Statistics, Biostats, CS, or equivalent experience.
Resourceful, fast learner, high ownership, bias to action, fast experimentation cycles, and ability to work independently while collaborating in a small team.
Understanding of fairness criteria: Independence, Separation, and Sufficiency.
Nice-to-Haves
Background in applied AI companies with strong product traction (not hype-driven firms).
Interest in healthcare data (e.g., from research labs with practical applications).
Side projects demonstrating productionization (e.g., turning prototypes like landing agents into reliable systems).
Uncertainty quantification.
Covariate and prediction drift detection in production.
Hands-on experience with LLMs in production; LLMs for clinical text, weak/active/semi-supervised learning.
Strong software engineering skills with proven ML experience: productionizing models (tabular/text data preferred; not pure vision specialists) and building scalable pipelines.
Familiarity with EHR schemas/standards (FHIR/HL7), IRB/validation study workflows, and model governance.
We Provide:
Competitive salary with performance-based incentives
Comprehensive healthcare benefits: we cover 100% of premiums for employees
Unlimited paid time off: we need you at your best. We recommend 20 PTO days per year so you can schedule your work around your life.
401(k) match of up to 4% of employee salary
Laptop and equipment budget to set up your at-home office environment
Lunch, snacks, and drinks are provided in the office to ensure you never go hungry :)
Opportunity for professional growth in a dynamic, fast-paced startup environment
Location: San Francisco (hybrid)
Compensation depends on experience, overall fit for the role, and candidate location.
If you're passionate about applying frontier AI to real-world impact, join us in building healthcare's future.