Paper Summary: Self-instruct: Aligning Language Models with Self-generated Instructions

Please note: this post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.

[Figure: Self-Instruct: Aligning Language Models with Self-Generated Instructions]
Source

WHAT

A way to fine-tune LLMs to follow instructions using only data generated by the model itself, with no human annotation needed.

WHY

Because human-annotated instruction datasets are expensive and time-consuming to create.

HOW

  • 1) Use the pre-trained LLM itself to generate new instructions with input/output instances, bootstrapping from a small set of seed tasks (175 tasks, one instance each); see the sketch after this list.

  • 2) Perform supervised fine-tuning on the pairs from step 1), after filtering out low-quality and near-duplicate generations with simple heuristics.
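
To make step 1) concrete, here is a minimal Python sketch of the bootstrapping loop. `complete` (a completion call to the pre-trained LLM) and `rouge_l` (a ROUGE-L similarity scorer) are hypothetical stand-ins, and the prompting format is heavily simplified; dropping new instructions that overlap too much with the existing pool is the kind of heuristic filter the paper applies.

```python
import random

def self_instruct(seed_tasks, complete, rouge_l, target_size=50_000):
    """Grow a pool of instruction tasks starting from the 175 seed tasks."""
    pool = list(seed_tasks)  # each task: {"instruction": ..., "instance": ...}
    while len(pool) < target_size:
        # Prompt the vanilla LLM with a few in-context tasks sampled
        # from the pool and ask it to invent a new instruction.
        demos = random.sample(pool, k=8)
        prompt = "\n\n".join(f"Task: {t['instruction']}" for t in demos)
        new_instruction = complete(prompt + "\n\nTask:").strip()

        # Heuristic filter: drop near-duplicates of existing instructions.
        if any(rouge_l(new_instruction, t["instruction"]) >= 0.7 for t in pool):
            continue

        # Ask the same model to produce an input/output instance
        # for the newly generated instruction.
        instance = complete(f"Task: {new_instruction}\nInput:")
        pool.append({"instruction": new_instruction, "instance": instance})
    return pool
```

The pool produced by this loop is the training set for the supervised fine-tuning in step 2).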

CLAIMS

  • In one experiment, GPT3 fine-tuned with Self-Instruct answers 44.4% of questions correctly, while InstructGPT (GPT3 aligned with RLHF) reaches 50.7%.

NOTES

  • All tasks are represented in the form (task definition, input/output pairs). It's a versatile way to represent any kind of task. Example below:

[Figure: How the authors represent the instruction tasks used to align the model.]
Source
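
To make the format concrete, here is one task written out as a Python dict; the field names are illustrative, not the authors' exact schema. The last two lines sketch how an instance might be flattened into a prompt/completion pair for fine-tuning.

```python
# One task in the (task definition, input/output instances) format.
task = {
    "instruction": "Detect whether the sentence contains a grammatical error.",
    "instances": [
        {"input": "He go to school every day.", "output": "Yes"},
        {"input": "She walks to work.", "output": "No"},
    ],
}

# Flatten an instance into a prompt/completion pair for fine-tuning.
prompt = f"Task: {task['instruction']}\nInput: {task['instances'][0]['input']}\nOutput:"
completion = " " + task["instances"][0]["output"]
```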

MY 2¢

The key contribution is a recipe for generating alignment examples from a vanilla LLM, with no human annotation beyond the 175 seed tasks.

