queirozf.com
Entries by tag: instruction-following (including child/synonym tags)
Paper Summary: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
06 Oct 2024
paper-summary
alignment
instruction-tuning
Summary of the 2024 article "Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study" by Xu et al.
Paper Summary: Llama 2: Open Foundation and Fine-Tuned Chat Models
01 Aug 2023
paper-summary
instruction-following
language-modeling
Summary of the 2023 article "Llama 2: Open Foundation and Fine-Tuned Chat Models" by Touvron et al.
Paper Summary: Fine-tuned Language models are Zero-Shot Learners
02 Jul 2023
paper-summary
instruction-following
Summary of the 2022 article "Fine-tuned Language models are Zero-Shot Learners" by Wei et al., also known as the FLAN article.
Paper Summary: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
23 Jun 2023
paper-summary
instruction-following
Summary of the 2023 article "Direct Preference Optimization: Your Language Model is Secretly a Reward Model" by Rafailov et al.
Paper Summary: LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
04 Jun 2023
paper-summary
language-modeling
instruction-following
Summary of the 2023 article "LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention" by Zhang et al.
Paper Summary: Self-instruct: Aligning Language Models with Self-generated Instructions
03 Jun 2023
paper-summary
language-modeling
alignment
Summary of the 2022 article "Self-instruct: Aligning Language Models with Self-generated Instructions" by Wang et al.
Paper Summary: Training language models to follow instructions with human feedback
05 Feb 2023
paper-summary
language-models
alignment
Summary of the 2022 article "Training language models to follow instructions with human feedback" by Ouyang et al., also known as the InstructGPT article.