Predictive Disease Risk Modeling at 23andMe with Subarna Sinha

EPISODE 436

About this Episode

Today we're joined by Subarna Sinha, Machine Learning Engineering Leader at 23andMe. 23andMe handles a massive amount of genomic data every year through its core ancestry business, and it also uses that data for disease prediction, which is the main use case we discuss in our conversation. Subarna talks us through an initial use case, evaluating polygenic scores, and how it led her team to build an ML pipeline and platform. We talk through the tools and tech stack used to operationalize the platform, the use of synthetic data, the internal pushback that accompanied the changes, and what's next for her team and the platform.

Thanks to our sponsor Pachyderm

Pachyderm is an enterprise-grade, open source data science platform that makes explainable, repeatable, and scalable machine learning and artificial intelligence possible. The platform combines version control for data with the tools to build scalable end-to-end machine learning and artificial intelligence pipelines, while letting users work in any language, framework, or tool they want. The company is headquartered in San Francisco and is backed by Benchmark, M12, Y Combinator, and others.