queirozf.com

Entries by tag: instruction-following

Including child/synonym tags

Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study  06 Oct 2024    paper-summary alignment instruction-tuning
Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al. Read More ›

Paper Summary: Llama 2: Open Foundation and Fine-Tuned Chat Models  01 Aug 2023    paper-summary instruction-following language-modeling
Summary of the 2023 article "Llama 2: Open Foundation and Fine-Tuned Chat Models" by Touvron et al. Read More ›

Paper Summary: Finetuned Language Models Are Zero-Shot Learners  02 Jul 2023    paper-summary instruction-following
Summary of the 2022 article "Finetuned Language Models Are Zero-Shot Learners" by Wei et al., aka the FLAN article. Read More ›

Paper Summary: Direct Preference Optimization: Your Language Model is Secretly a Reward Model  23 Jun 2023    paper-summary instruction-following
Summary of the 2023 article "Direct Preference Optimization: Your Language Model is Secretly a Reward Model" by Rafailov et al. Read More ›

Paper Summary: LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention  04 Jun 2023    paper-summary language-modeling instruction-following
Summary of the 2023 article "LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention" by Zhang et al. Read More ›

Paper Summary: Self-Instruct: Aligning Language Models with Self-Generated Instructions  03 Jun 2023    paper-summary language-modeling alignment
Summary of the 2022 article "Self-Instruct: Aligning Language Models with Self-Generated Instructions" by Wang et al. Read More ›

Paper Summary: Training language models to follow instructions with human feedback  05 Feb 2023    paper-summary language-models alignment
Summary of the 2022 article "Training language models to follow instructions with human feedback" by Ouyang et al., aka the InstructGPT article. Read More ›