queirozf.com
Entries by tag:
instruction-tuning
Including child/synonym tags
Paper Summary: KTO: Model Alignment as Prospect Theoretic Optimization
21 Jul 2025
paper-summary
instruction-tuning
language-modeling
Summary of the 2024 article "KTO: Model Alignment as Prospect Theoretic Optimization" (AKA the KTO paper) by Ethayarajh et al.
Paper Summary: A General Theoretical Paradigm to Understand Learning from Human Preferences
21 Jul 2025
paper-summary
instruction-tuning
language-modeling
Summary of the 2023 article "A General Theoretical Paradigm to Understand Learning from Human Preferences" (AKA the IPO paper) by Azar et al.
Paper Summary: Fine-Tuning Language Models from Human Preferences
20 Jul 2025
paper-summary
language-modeling
instruction-tuning
Summary of the 2019 article "Fine-Tuning Language Models from Human Preferences" by Ziegler et al.
Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
06 Oct 2024
paper-summary
alignment
instruction-tuning
Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al.
Paper Summary: Multitask Prompted Training Enables Zero-Shot Task Generalization
31 Mar 2024
paper-summary
instruction-tuning
language-modeling
Summary of the 2021 article "Multitask Prompted Training Enables Zero-Shot Task Generalization" (AKA the T0, or T-zero, paper) by Sanh et al.
Paper Summary: Zephyr: Direct Distillation of LM Alignment
02 Jan 2024
paper-summary
instruction-tuning
Summary of the 2023 article "Zephyr: Direct Distillation of LM Alignment" by Tunstall et al.
Paper Summary: Constitutional AI
16 Nov 2023
paper-summary
instruction-tuning
language-models
Summary of the 2022 article "Constitutional AI" by Anthropic.