Paper Summary: Cross-Task Generalization via Natural Language Crowdsourcing Instructions

Last updated: 02 Jul 2023

Please note: this post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.

WHAT

Build a dataset of high-quality instructions paired with task examples.
Measure how well models perform when fine-tuned to follow those instructions.

WHY

To provide a dataset for other people to build upon.
To examine the tradeoff between fine-tuning a smaller model and using a much larger model as-is.

HOW

Build a dataset of instructions with task examples and fine-tune a pre-trained LM on it.
Each task comes with both an instruction and worked examples, so models are queried in a few-shot setting.
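
To make the few-shot querying concrete, here is a minimal sketch of how an instruction, a couple of worked examples and a new input could be stitched into a single prompt. The `Task` class, the `build_prompt` function and the "Instruction / Input / Output" layout are my own hypothetical choices for illustration; they are not the paper's actual schema or encoding.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Task:
    """Hypothetical container: one instruction plus worked (input, output) examples."""
    instruction: str
    examples: List[Tuple[str, str]] = field(default_factory=list)


def build_prompt(task: Task, new_input: str, max_examples: int = 2) -> str:
    """Concatenate the instruction, a few demonstrations, and the new input."""
    parts = [f"Instruction: {task.instruction}"]
    # Use at most `max_examples` demonstrations (the "shots").
    for example_input, example_output in task.examples[:max_examples]:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The last block leaves the output empty for the model to complete.
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(parts)
```

Calling `build_prompt(task, "some new passage")` returns a single string that ends in `Output:`, which is what would be fed to the model, either during fine-tuning (with the gold output as the target) or at query time.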

CLAIMS

LMs fine-tuned for instruction-following can generalize to task instances and even task types not seen in the training dataset.
A 140M-parameter model (BART), when fine-tuned, is better at following instructions than GPT-3 with 175B parameters.

EXTENDS/USES

BART LM (Lewis et al., 2019)

QUOTES

The authors didn't try to fine-tune GPT-3, apparently because they didn't have enough compute resources: "We cannot fine-tune the parameters of [GPT-3] and use it as-is under its default setting"

NOTES

Uses ROUGE for evaluation (generated output vs. reference output).
Examples in the evaluation set are not from different tasks than those in the training set; they are different examples of the same tasks.
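
As a reminder of what a ROUGE-style comparison between generated and reference text looks like, here is a minimal sketch of ROUGE-L, the longest-common-subsequence variant. I am assuming that variant, plus lowercasing and whitespace tokenization, for illustration only. This is a simplified re-implementation and not the scoring script used in the paper.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(generated: str, reference: str) -> float:
    """ROUGE-L F1 using lowercased, whitespace-split tokens (a simplification)."""
    gen = generated.lower().split()
    ref = reference.lower().split()
    if not gen or not ref:
        return 0.0
    lcs = lcs_length(gen, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(gen)  # share of generated tokens that are in the LCS
    recall = lcs / len(ref)     # share of reference tokens that are in the LCS
    return 2 * precision * recall / (precision + recall)
```

An exact match scores 1.0 and two texts with no tokens in common score 0.0. Everything in between rewards generated text that keeps the reference tokens in the same order.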

MY 2¢

Why don't people use this instruction dataset more often?
This is an updated version of a 2021 paper called "Natural Instructions: Benchmarking generalization to new tasks from natural language instructions". It is sometimes referenced by its old name.

References

ACL: Mishra et al., 2022: Cross-Task Generalization via Natural Language Crowdsourcing Instructions

Felipe 25 Jun 2023 02 Jul 2023 paper-summary
