queirozf.com

Entries by tag: instruction-tuning

Paper Summary: KTO: Model Alignment as Prospect Theoretic Optimization  21 Jul 2025    paper-summary instruction-tuning language-modeling
Summary of the 2024 article "KTO: Model Alignment as Prospect Theoretic Optimization" (AKA the KTO paper) by Ethayarajh et al. Read More ›

Paper Summary: A General Theoretical Paradigm to Understand Learning from Human Preferences  21 Jul 2025    paper-summary instruction-tuning language-modeling
Summary of the 2023 article "A General Theoretical Paradigm to Understand Learning from Human Preferences" (AKA the IPO paper) by Azar et al. Read More ›

Paper Summary: Fine-Tuning Language Models from Human Preferences  20 Jul 2025    paper-summary language-modeling instruction-tuning
Summary of the 2019 article "Fine-Tuning Language Models from Human Preferences" by Ziegler et al. Read More ›

Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study  06 Oct 2024    paper-summary alignment instruction-tuning
Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al. Read More ›

Paper Summary: Multitask Prompted Training Enables Zero-Shot Task Generalization  31 Mar 2024    paper-summary instruction-tuning language-modeling
Summary of the 2021 article "Multitask Prompted Training Enables Zero-Shot Task Generalization" (AKA the T0, or T-zero, paper) by Sanh et al. Read More ›

Paper Summary: Zephyr: Direct Distillation of LM Alignment  02 Jan 2024    paper-summary instruction-tuning
Summary of the 2023 article "Zephyr: Direct Distillation of LM Alignment" by Tunstall et al. Read More ›

Paper Summary: Constitutional AI  16 Nov 2023    paper-summary instruction-tuning language-models
Summary of the 2022 article "Constitutional AI: Harmlessness from AI Feedback" by Bai et al. (Anthropic). Read More ›