queirozf.com

Entries by tag: instruction-tuning

Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study  06 Oct 2024    paper-summary alignment instruction-tuning
Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al. Read More ›

Paper Summary: Multitask Prompted Training Enables Zero-Shot Task Generalization  31 Mar 2024    paper-summary instruction-tuning language-modeling
Summary of the 2021 article "Multitask Prompted Training Enables Zero-Shot Task Generalization" by Sanh et al. AKA the T0 (T-zero) article Read More ›

Paper Summary: Zephyr: Direct Distillation of LM Alignment  02 Jan 2024    paper-summary instruction-tuning
Summary of the 2023 article "Zephyr: Direct Distillation of LM Alignment" by Tunstall et al. Read More ›

Paper Summary: Constitutional AI  16 Nov 2023    paper-summary instruction-tuning language-models
Summary of the 2022 article "Constitutional AI" by Anthropic. Read More ›