
The source provides an overview of Supervised Fine-Tuning (SFT) for large language models, explaining it as a method for specializing pre-trained models for particular tasks by training them on curated, labeled datasets. It compares full fine-tuning with Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA, highlighting their trade-offs, and then outlines practical workflows for fine-tuning both API-based and open-weight models, emphasizing the critical importance of data quality and curation. It also examines advanced alignment techniques, positioning SFT as a foundational step for methods such as Direct Preference Optimization (DPO), and discusses essential hyperparameters and evaluation metrics. Finally, the source addresses significant risks and limitations of SFT, including catastrophic forgetting and increased hallucination, and offers strategic recommendations for applying it effectively in real-world scenarios.
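
For concreteness, a LoRA-based SFT run on an open-weight model typically looks like the sketch below. The source does not name a specific framework, so the choice of the Hugging Face trl/peft stack, the model checkpoint, the dataset path, and every hyperparameter value here are illustrative assumptions rather than the source's own recipe.

```python
# Minimal LoRA SFT sketch using Hugging Face datasets/peft/trl.
# All names and values below are assumptions for illustration only.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hypothetical instruction dataset; SFTTrainer expects a "text" column
# (or a chat "messages" column) by default.
dataset = load_dataset("json", data_files="sft_data.jsonl", split="train")

peft_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the LoRA update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",      # any causal LM checkpoint (assumed)
    train_dataset=dataset,
    peft_config=peft_config,              # train only the LoRA adapters
    args=SFTConfig(
        output_dir="sft-lora-out",
        num_train_epochs=3,
        learning_rate=2e-4,               # LoRA tolerates higher LRs than full FT
        per_device_train_batch_size=4,
    ),
)
trainer.train()
```

Because only the small adapter matrices are updated, a run like this illustrates the PEFT trade-off the source highlights: far lower memory and compute than full fine-tuning, at the cost of a more constrained update to the base model.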