Paper Summary: Self-Instruct: Aligning Language Models with Self-Generated Instructions

Last updated: 25 Jun 2023

Please note: this post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.

WHAT

A way to fine-tune LLMs to follow instructions using data generated by the model itself, with almost no human annotation needed (only a small set of seed tasks).

This matters because human-annotated instruction datasets are expensive to come by.

1) Use the pre-trained LLM itself to generate input/output instruction pairs, from a small set of seed pairs (one seed example per task, 175 examples in total).
- Seed data
- Generated instructions
2) Perform supervised fine-tuning on the pairs from step 1), after using heuristics to filter out low-quality or near-duplicate generations.
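
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `bootstrap`, `is_novel` and the stub `toy_generate` are hypothetical names, the stub stands in for a real call to the pre-trained LLM, and `difflib.SequenceMatcher` is swapped in as a simple similarity measure (the paper uses its own filtering heuristics). The 0.7 threshold is likewise just a value chosen for the example.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, Dict, List

Task = Dict[str, str]  # keys: "instruction", "input", "output"


def is_novel(instruction: str, pool: List[Task], threshold: float = 0.7) -> bool:
    """True if the instruction is not too similar to anything already in the pool.

    SequenceMatcher is a stand-in similarity measure chosen for this sketch.
    """
    return all(
        SequenceMatcher(None, instruction.lower(), t["instruction"].lower()).ratio()
        < threshold
        for t in pool
    )


def bootstrap(
    seed_tasks: List[Task],
    generate: Callable[[List[Task]], Task],
    rounds: int = 3,
    shots: int = 2,
) -> List[Task]:
    """Grow a task pool from seeds; return only the newly generated tasks."""
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    pool = list(seed_tasks)
    for _ in range(rounds):
        # Show the model a few tasks from the pool as in-context examples.
        examples = rng.sample(pool, k=min(shots, len(pool)))
        candidate = generate(examples)
        # Heuristic filter: drop empty or near-duplicate instructions.
        text = candidate["instruction"].strip()
        if text and is_novel(text, pool):
            pool.append(candidate)
    return pool[len(seed_tasks):]


seeds = [
    {"instruction": "Translate the sentence to French.",
     "input": "Good morning.", "output": "Bonjour."},
    {"instruction": "Classify the sentiment of the review.",
     "input": "I loved it.", "output": "positive"},
]

# Stub "model": returns canned generations instead of calling an LLM.
_canned = iter([
    {"instruction": "Translate the sentence to French!",  # near-duplicate
     "input": "Thank you.", "output": "Merci."},
    {"instruction": "Write a haiku about autumn.",  # novel
     "input": "", "output": "Red leaves drift and fall"},
    {"instruction": "", "input": "", "output": ""},  # empty
])


def toy_generate(examples: List[Task]) -> Task:
    return next(_canned)


new_tasks = bootstrap(seeds, toy_generate, rounds=3)
print([t["instruction"] for t in new_tasks])
```

The tasks that survive the filter (together with their input/output pairs) would then be the data used for the supervised fine-tuning in step 2).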

In one experiment, GPT3_{self-instruct} gives correct answers 44.4% of the time, while InstructGPT (GPT3 aligned with RLHF) reaches 50.7%.

All tasks are represented in the form (task definition, input/output pairs). It's a versatile way to represent any kind of task. Example below:

How the authors represent the instruction tasks to align the model.
Source
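
To make the (task definition, input/output pairs) form concrete, here is a small sketch of how such a task could be stored and flattened into prompt/completion pairs for supervised fine-tuning. The class name `InstructionTask`, the method `to_training_examples` and the `Instruction:/Input:/Output:` prompt template are my own assumptions for illustration, not the exact format used in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InstructionTask:
    """One task: a natural-language definition plus its input/output pairs."""
    definition: str
    instances: List[Tuple[str, str]] = field(default_factory=list)

    def to_training_examples(self) -> List[Dict[str, str]]:
        """Flatten the task into one prompt/completion pair per instance."""
        examples = []
        for task_input, task_output in self.instances:
            prompt = f"Instruction: {self.definition}\n"
            if task_input:  # some tasks need no separate input
                prompt += f"Input: {task_input}\n"
            prompt += "Output:"
            examples.append({"prompt": prompt, "completion": " " + task_output})
        return examples


sentiment = InstructionTask(
    definition="Classify the sentiment of the sentence as positive or negative.",
    instances=[
        ("I loved this movie.", "positive"),
        ("The food was cold.", "negative"),
    ],
)
examples = sentiment.to_training_examples()

# A task whose instruction is self-contained and has no input.
essay = InstructionTask(
    definition="Write a one-sentence summary of why sleep matters.",
    instances=[("", "Sleep lets the body and brain recover.")],
)
essay_examples = essay.to_training_examples()

print(examples[0]["prompt"])
```

The same structure covers classification, generation and tasks with no input at all, which is what makes the representation so versatile.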

The contribution is how to generate alignment examples from a vanilla LLM.

1: Such as InstructGPT/ChatGPT, which are based on RLHF.