© 2024 PodJoint
Snacks Weekly on Data Science
Pan Wu
120 episodes
6 days ago
This podcast is about making data science and machine learning knowledge accessible and less intimidating. Every week, I handpick one industry tech blog post and break it down. We discuss key data science concepts and machine learning algorithms, and how they are applied in real-world applications. Subscribe to the channel and enjoy Snacks Weekly on Data Science!
Education
RSS
All content for Snacks Weekly on Data Science is the property of Pan Wu and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (20/120)
Product Recommendations with LLMs and Word2Vec [CVS Health]

In this episode, we explore how CVS Health builds its product recommendation system to deliver relevant, timely suggestions across millions of customers and thousands of products. We look at the business motivation behind personalization at CVS, and then walk through how the team uses Word2Vec, Euclidean distance, LLM-generated product summaries, and iterative refinement to improve the system step by step.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/cvs-health-tech-blog/enhancing-you-may-also-like-ymal-systems-using-llms-and-word2vec-0340280019d2
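
The Euclidean-distance step mentioned in this episode can be illustrated with a minimal sketch. The product names and 3-dimensional vectors below are made up for illustration; real Word2Vec embeddings would be learned from data and have many more dimensions, and this is not CVS Health's actual implementation.

```python
import math

# Hypothetical toy embeddings (assumption: illustrative values only).
embeddings = {
    "vitamin_c": [0.9, 0.1, 0.0],
    "vitamin_d": [0.8, 0.2, 0.1],
    "shampoo": [0.1, 0.9, 0.3],
    "conditioner": [0.2, 0.8, 0.4],
}


def most_similar(product, embeddings, k=2):
    """Return the k products closest to `product` by Euclidean distance."""
    query = embeddings[product]
    distances = [
        (other, math.dist(query, vec))
        for other, vec in embeddings.items()
        if other != product  # never recommend the product itself
    ]
    # Smaller distance means the products sit closer in embedding space.
    distances.sort(key=lambda pair: pair[1])
    return [name for name, _ in distances[:k]]


recommendations = most_similar("vitamin_c", embeddings, k=1)
print(recommendations)
```

In this toy data the two vitamins sit close together, as do the two hair-care products, so each one's nearest neighbor is its category sibling.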

23 hours ago
8 minutes 55 seconds

Building AI Agents at Airtable [Airtable]

In this episode, we explore how Airtable built AI Agents—a system that lets users automate workflows using natural language. We examine the business motivation behind making automation more accessible and break down the technical architecture that ensures these agents are safe, reliable, and tightly integrated into Airtable’s platform.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/airtable-eng/how-we-built-ai-agents-at-airtable-70838d73cc43

1 week ago
10 minutes 3 seconds

Quick Thoughts and Reflections at the End of 2025

In this episode, I share a few key observations and reflections drawn from the tech blogs I read throughout 2025. The themes include the rise of real-world LLM applications, a move toward deeply customized machine learning solutions, and the evolving skill sets in data and AI, with continuous learning becoming more important than ever.

I’d also like to express my sincere appreciation to everyone who has listened, read, engaged with, or shared my posts and podcasts this year. Thank you for making this journey so rewarding and fun. I wish you a restful holiday season and an inspiring start to the new year.

2 weeks ago
8 minutes 7 seconds

Real-time Spatial and Temporal Forecasting [Lyft]

In this episode, we explore how Lyft identified the right algorithmic approach for building a real-time spatial-temporal forecasting system. The team evaluated two major model families for this task: classical time-series models and deep neural networks. This study highlights the balance between accuracy and practicality—and serves as a valuable guide for choosing machine learning solutions that truly meet business needs.

For more details, you can refer to their published tech blog, linked here for your reference: https://eng.lyft.com/real-time-spatial-temporal-forecasting-lyft-fa90b3f3ec24

3 weeks ago
11 minutes 47 seconds

GenAI Solution for Invoice Document Processing [Uber]

In this episode, we explore how Uber tackled the challenge of processing an enormous volume of invoices that vary widely in layout, language, and quality. We break down how generative AI plays a central role in helping them build a more flexible and scalable document-processing system. By combining OCR, LLM-based extraction, and a thoughtful human-in-the-loop workflow, Uber created a platform that’s faster, more accurate, and far easier to maintain than traditional rule-based automation.
For more details, you can refer to their published tech blog, linked here for your reference: https://www.uber.com/blog/advancing-invoice-document-processing-using-genai

4 weeks ago
10 minutes 10 seconds

Optimize Web Performance [Walmart]

In this episode, we explore how Walmart's engineering team tackled web performance optimization at scale: they set top-line targets, moved from server-centric metrics to user-centric ones like Core Web Vitals, integrated these measures into their experimentation framework, and ultimately drove measurable business impact through improved engagement and organic traffic.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/walmartglobaltech/walmart-journey-to-optimize-web-performance-and-drive-business-growth-c3bec8d7780b

1 month ago
9 minutes 57 seconds

Understanding Metric Movement with Root Cause Analysis [Pinterest]

In this episode, we explore how Pinterest tackled one of the toughest challenges in large-scale analytics — understanding why metrics move. We discuss how their engineering team built a root cause analysis platform that combines Slice and Dice, General Similarity, and Experiment Effects, with each component addressing a different part of the problem. This system brings together analytics, statistics, and engineering into an actionable workflow, empowering teams to respond faster and with greater confidence.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/pinterest-engineering/the-quest-to-understand-metric-movements-8ab12ae97cda

1 month ago
11 minutes 53 seconds

Improving Search Ranking for Maps [Airbnb]

In this episode, we explore how Airbnb improved search ranking for its map interface — a challenge that sits at the intersection of user behavior, design, and data science. From assuming uniform attention to modeling tiered and spatial attention, Airbnb’s team systematically refined how users interact with map results. This work shows how aligning user attention with booking likelihood can drive real business impact — improving bookings, enhancing customer satisfaction, and increasing overall platform efficiency.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/airbnb-engineering/improving-search-ranking-for-maps-13b03f2c2cca

1 month ago
9 minutes 39 seconds

Out-of-Stock Product Recommendations with Machine Learning [Instacart]

In this episode, we explore how Instacart leverages machine learning to suggest smart replacements for out-of-stock products — a challenge that’s central to the grocery delivery experience. We dive into Instacart’s two-model approach, where a deep learning model uncovers general product relationships across the catalog, and an engagement model learns from customer behavior to personalize those recommendations. Together, they power a system that makes replacements more accurate, relevant, and efficient at scale.

For more details, you can refer to their published tech blog, linked here for your reference: https://tech.instacart.com/how-instacart-uses-machine-learning-to-suggest-replacements-for-out-of-stock-products-8f80d03bb5af

1 month ago
10 minutes 40 seconds

Covariate Selection in Causal Inference [Booking.com]

In this episode, we explore the importance of covariate selection in causal inference and how different types of variables can influence the results. The discussion highlights why careful covariate selection is essential for generating reliable insights and enabling smarter, evidence-based business decisions.
For more details, you can refer to their published tech blog, linked here for your reference: https://booking.ai/covariate-selection-in-causal-inference-good-and-bad-controls-5f56126a984a

2 months ago
12 minutes 30 seconds

Personalizing Marketing with Uplift Modeling [Klaviyo]

In this episode, we explore how Klaviyo used counterfactual learning and uplift modeling to move beyond the question of which treatment works — to the deeper question of for whom it works. We’ll see how the team combined randomized experiments, causal inference techniques, and uplift modeling to power a product that helps marketers deliver smarter, more personalized messages.
For more details, you can refer to their published tech blog, linked here for your reference: https://klaviyo.tech/the-stats-that-tell-you-what-could-have-been-counterfactual-learning-and-uplift-modeling-e95d3b712d8a

2 months ago
9 minutes 51 seconds

Quick History and Fun Facts About Halloween: Pumpkins, Candies, and Costumes

In this Halloween special episode, we explore some fun facts and surprising data behind these festive favorites: Did you know Illinois is the top pumpkin-producing state, harvesting nearly 40% of all pumpkins in the U.S.? Or that Reese’s Peanut Butter Cups consistently rank as America’s most popular Halloween candy? Or that at least 20% of pet owners now dress up their pets for Halloween? Let’s dive into these facts and the history behind the holiday. Enjoy!

2 months ago
7 minutes 2 seconds

Feed Ranking: From Batch Inference to Online Inference [Whatnot]

In this episode, we explore how Whatnot improved its feed ranking system by moving from batch predictions to online inference—enabling the platform to scale effectively while capturing real-time marketplace dynamics. This evolution reflects a broader shift in recommendation systems toward more adaptive, real-time personalization.

For more details, check out the full tech blog from the Whatnot engineering team: https://medium.com/whatnot-engineering/evolving-feed-ranking-at-whatnot-25adb116aeb6

2 months ago
7 minutes 58 seconds

Self-serve Experimentation Tool for Marketing [Tripadvisor]

In this episode, we explore Tripadvisor’s self-serve experimentation platform for marketing. On the business side, the challenge was measuring campaign effectiveness in a messy, external environment where clean randomization isn’t always possible. On the technical side, the Tripadvisor team developed a system that applies causal inference techniques, particularly the difference-in-differences method, to deliver reliable estimates of campaign impact.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/tripadvisor/introducing-baldur-tripadvisors-self-serve-experimentation-tool-for-marketing-7fc9933b25cc
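
As a rough illustration of the difference-in-differences idea mentioned in this episode, the sketch below computes the basic two-group, two-period estimate. The function name and all numbers are hypothetical, and this is not Tripadvisor's implementation; the estimate is only meaningful if the two groups would have followed parallel trends without the campaign.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Basic difference-in-differences estimate.

    Change in the treated group minus change in the control group.
    Assumes both groups would have moved in parallel without treatment.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly bookings for two markets (illustrative values only).
effect = diff_in_diff(
    treated_pre=[100, 110],   # campaign market, before launch
    treated_post=[130, 140],  # campaign market, after launch
    control_pre=[90, 100],    # comparison market, before launch
    control_post=[100, 110],  # comparison market, after launch
)
print(effect)
```

Here the campaign market rose by 30 on average and the comparison market by 10, so the estimated campaign effect is the remaining 20.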

3 months ago
10 minutes 42 seconds

Global Feature Importance with Collective Wisdom [Meta]

In this episode, we look at how Meta addressed the challenge of feature selection at scale through Global Feature Importance—a system that aggregates insights across models to surface the most valuable features. This approach not only streamlines model development but also enables machine learning engineers to iterate more effectively and build models that deliver stronger business impact.
For more details, check out Meta’s published tech blog here: https://medium.com/@AnalyticsAtMeta/collective-wisdom-of-models-advanced-feature-importance-techniques-at-meta-1a7a8d2f9e27

3 months ago
8 minutes 25 seconds

Evaluating Retrieval Capabilities of Language Models [Microsoft]

In this episode, we explore how to evaluate the retrieval-augmented generation (RAG) capabilities of small language models. On the business side, we discuss why RAG, long context windows, and small language models are critical for building scalable and reliable AI systems. On the technical side, we walk through the Needle-in-a-Haystack methodology and discuss key findings about retrieval performance across different models.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/data-science-at-microsoft/evaluating-rag-capabilities-of-small-language-models-e7531b3a5061

3 months ago
10 minutes 1 second

Personalized Recommendation with Foundation Models [Netflix]

In this episode, we explore how Netflix enhanced recommendation personalization using foundation models. These models can process massive user histories through tokenization and attention mechanisms, while also addressing the cold-start problem with hybrid embeddings. The work highlights how principles from large language models can be adapted to build more effective recommendation systems at scale.

For more details, you can refer to their published tech blog, linked here for your reference: https://netflixtechblog.com/foundation-model-for-personalized-recommendation-1a0bd8e02d39

3 months ago
11 minutes 37 seconds

A/B Testing vs. Multi-Armed Bandits: A Simulated Study [Vanguard]

In this episode, we explore how Vanguard evaluated standard A/B testing against multi-armed bandits for digital experimentation. Their simulated study showed that A/B testing is often the better choice when dealing with a small number of variations, while bandit strategies, such as Thompson Sampling, become more effective as the number of variations increases. The broader lesson is that experimentation design should always be context-aware—balancing simplicity, speed, and interpretability based on your business needs.


For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/vanguard-technology/smarter-web-wins-a-b-testing-vs-multi-armed-bandits-unpacked-7f5032358513
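
For readers unfamiliar with Thompson Sampling, here is a minimal Beta-Bernoulli sketch. The conversion rates, round count, and function name are assumptions chosen for illustration, and this does not reproduce Vanguard's simulation setup.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling over simulated variations.

    true_rates: hypothetical conversion rate per variation (unknown to
    the algorithm; used only to simulate visitor outcomes).
    Returns per-variation success and failure counts.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_rounds):
        # Draw a plausible rate for each variation from its Beta posterior
        # (uniform Beta(1, 1) prior plus observed outcomes).
        samples = [
            rng.betavariate(s + 1, f + 1)
            for s, f in zip(successes, failures)
        ]
        # Show the variation whose sampled rate is highest.
        arm = samples.index(max(samples))
        # Simulate whether this visitor converts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


successes, failures = thompson_sampling([0.05, 0.06, 0.20], n_rounds=5000)
pulls = [s + f for s, f in zip(successes, failures)]
print(pulls)
```

Unlike a fixed-split A/B test, traffic allocation here shifts over time toward variations that appear to perform better, which is the trade-off the episode discusses.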

4 months ago
10 minutes 33 seconds

Catalog Attribute Extraction with Multi-Modal LLMs [Instacart]

In this episode, we explore how Instacart tackled the challenge of extracting accurate product attributes at scale. We discuss different solutions—starting with SQL rules, moving to text-based ML models, and finally, Instacart’s multi-modal LLM platform, PARSE. By blending text and image data and enabling rapid configuration, PARSE demonstrates how modern AI tools can streamline data pipelines, reduce engineering overhead, and deliver better user experiences.


For more details, you can refer to their published tech blog, linked here for your reference: https://tech.instacart.com/multi-modal-catalog-attribute-extraction-platform-at-instacart-b9228754a527

4 months ago
10 minutes 24 seconds

Segmenting Supply with a Data-Driven Methodology [Airbnb]

In this episode, we explore how Airbnb developed a structured framework that combines unsupervised clustering and supervised modeling to classify listings into meaningful supply personas based on availability patterns. This data-driven approach helps Airbnb enhance personalization, improve experimentation, and gain deeper insights into its global supply base.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/airbnb-engineering/from-data-to-insights-segmenting-airbnbs-supply-c88aa2bb9399

4 months ago
8 minutes 17 seconds
