Paper Summary: DTATG: An Automatic Title Generator Based on Dependency Trees
Please note: This post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.
WHAT
The authors propose the Dependency-Tree Automatic Title Generator (DTATG), a method that builds appropriate titles for news articles using heuristics and syntactic parsing.
WHY
Other similar methods are either ineffective (they generate only unordered sets of words) or only work on limited domains.
HOW
DTATG proceeds as a series of steps:
- Extract keywords from the document.
- Extract sentences using a text segmentation technique and rank them by how well they summarize the content of the document. These are the candidate sentences.
- Parse the candidate sentences with a dependency parser and trim out the unimportant parts.
- Filter out candidate sentences that fail empirical rules (title tests) about what makes a good title.
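
To make the flow of these steps concrete for myself, here is a minimal sketch in Python. It is not the authors' implementation: every function name, the stopword list, the frequency-based keyword extraction, the regex sentence splitter, the keyword-overlap ranking and the length-based title test are my own stand-ins. In particular, `trim_sentence` does not use a dependency parser at all; it only drops parenthetical text as a placeholder for the dependency-tree trimming described in the paper.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
             "is", "it", "of", "on", "or", "that", "the", "to", "was", "were", "with"}


def tokenize(text):
    """Lowercase the text and return its word-like tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_keywords(document, top_k=5):
    """Step 1 (stand-in): the top_k most frequent non-stopword tokens."""
    counts = Counter(t for t in tokenize(document) if t not in STOPWORDS)
    return {word for word, _ in counts.most_common(top_k)}


def split_sentences(document):
    """Step 2a (stand-in): naive split on whitespace that follows ., ! or ?"""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def rank_sentences(sentences, keywords):
    """Step 2b (stand-in): rank by the number of distinct keywords contained."""
    def score(sentence):
        return len(keywords & set(tokenize(sentence)))
    return sorted(sentences, key=score, reverse=True)


def trim_sentence(sentence):
    """Step 3 (placeholder): the paper trims a dependency parse; this only
    removes parenthetical text and trailing sentence punctuation."""
    without_parens = re.sub(r"\s*\([^)]*\)", "", sentence)
    return without_parens.strip().rstrip(".!?")


def passes_title_tests(candidate, keywords, min_words=3, max_words=12):
    """Step 4 (stand-in): length bounds plus at least one keyword."""
    words = tokenize(candidate)
    return min_words <= len(words) <= max_words and bool(keywords & set(words))


def generate_title(document, top_k=5):
    """Run the four steps and return the best surviving candidate, or None."""
    keywords = extract_keywords(document, top_k)
    for sentence in rank_sentences(split_sentences(document), keywords):
        candidate = trim_sentence(sentence)
        if passes_title_tests(candidate, keywords):
            return candidate
    return None


if __name__ == "__main__":
    doc = ("The city council approved the new transit budget (after a long debate). "
           "The transit budget funds new bus routes. "
           "Critics on the council said it was late.")
    print(generate_title(doc))
```

A real version would swap each stand-in for the corresponding technique from the paper, with the biggest gap being step 3, which needs an actual dependency parse to decide which parts of a sentence can be dropped.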
CLAIMS
The authors claim that their method generates titles comparable to the original document titles. The generated titles are judged subjectively along 3 dimensions:
- Topic relevance (how relevant is the generated title to the document content?)
- Conciseness (how succinct and clear is the generated title?)
- Fluency (how grammatically correct is the generated title?)
NOTES
- DTATG only works if sentences can be extracted from the document and at least some of those sentences summarize the contents of the text.
MY 2¢
- I think this would be more accurately described as a candidate title ranker than as a title generator, because it requires that you have some candidate sentences beforehand.