Entries by tag: instruction-tuning

Including child/synonym tags

Paper Summary: KTO: Model Alignment as Prospect Theoretic Optimization 21 Jul 2025 paper-summary instruction-tuning language-modeling

Summary of the 2024 article "KTO: Model Alignment as Prospect Theoretic Optimization" (AKA the KTO paper) by Ethayarajh et al.

Paper Summary: A General Theoretical Paradigm to Understand Learning from Human Preferences 21 Jul 2025 paper-summary instruction-tuning language-modeling

Summary of the 2023 article "A General Theoretical Paradigm to Understand Learning from Human Preferences" (AKA the IPO paper) by Azar et al.

Paper Summary: Fine-Tuning Language Models from Human Preferences 20 Jul 2025 paper-summary language-modeling instruction-tuning

Summary of the 2019 article "Fine-Tuning Language Models from Human Preferences" by Ziegler et al.

Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study 06 Oct 2024 paper-summary alignment instruction-tuning

Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al.

Paper Summary: Multitask Prompted Training Enables Zero-Shot Task Generalization 31 Mar 2024 paper-summary instruction-tuning language-modeling

Summary of the 2021 article "Multitask Prompted Training Enables Zero-Shot Task Generalization" (AKA the T0, or T-zero, article) by Sanh et al.

Paper Summary: Zephyr: Direct Distillation of LM Alignment 02 Jan 2024 paper-summary instruction-tuning

Summary of the 2023 article "Zephyr: Direct Distillation of LM Alignment" by Tunstall et al.

Paper Summary: Constitutional AI 16 Nov 2023 paper-summary instruction-tuning language-models

Summary of the 2022 article "Constitutional AI" by Anthropic.


