Paper Summary: DTATG: An Automatic Title Generator Based on Dependency Trees

Please note: this post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.

WHAT

The authors propose the Dependency-Tree Automatic Title Generator (DTATG), a method that builds appropriate titles for news articles using heuristics and syntactic parsing.

WHY

Because other similar methods are either ineffective (they generate only unordered sets of words) or only work on limited domains.

HOW

DTATG works as a series of steps:

  • Extract keywords from the document

  • Extract sentences using a text segmentation technique and rank them wrt. how well they summarize the content of the document. These are candidate sentences.

  • Parse candidate sentences using a Dependency Parser and trim out unimportant bits

  • Filter out candidate sentences that fail empirical rules (title tests) as to what makes up good titles
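
To make the steps above concrete for myself, here is a rough Python sketch of the pipeline. It is not the authors' implementation: the frequency-based keyword extraction, the regex sentence splitter, the keyword-overlap ranking, and the two title tests (length limit, at least one keyword) are all my own simplifying assumptions, and every function name is hypothetical. The trimming step is only a placeholder, since the real method prunes a dependency tree and that needs an actual parser.

```python
import re
from collections import Counter

# Hypothetical stopword list, far too small for real use.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "are",
             "for", "it", "that", "this", "with", "as", "by", "was", "at"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_keywords(document, top_n=5):
    """Step 1 (assumption): keywords are the most frequent non-stopwords."""
    counts = Counter(t for t in tokenize(document) if t not in STOPWORDS)
    return {word for word, _ in counts.most_common(top_n)}


def split_sentences(document):
    """Step 2a (assumption): naive split after sentence-final punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]


def rank_sentences(sentences, keywords):
    """Step 2b (assumption): score = number of distinct keywords present."""
    def score(sentence):
        return len(keywords & set(tokenize(sentence)))
    return sorted(sentences, key=score, reverse=True)


def trim(sentence):
    """Step 3 placeholder: the real method prunes a dependency tree.

    Here we only strip trailing punctuation.
    """
    return sentence.rstrip(".!?")


def passes_title_tests(candidate, keywords, max_words=12):
    """Step 4 (assumed rules): short enough and mentions a keyword."""
    words = tokenize(candidate)
    return 0 < len(words) <= max_words and bool(keywords & set(words))


def generate_title(document):
    """Return the best-ranked candidate that passes the tests, or None."""
    keywords = extract_keywords(document)
    for sentence in rank_sentences(split_sentences(document), keywords):
        candidate = trim(sentence)
        if passes_title_tests(candidate, keywords):
            return candidate
    return None
```

Calling `generate_title(article_text)` returns one of the document's own sentences (minus trailing punctuation), or `None` if no sentence passes the tests.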

CLAIMS

  • The authors claim their method generates titles that are comparable to the original document titles. The titles are evaluated subjectively along 3 dimensions:

    • Topic relevance (how relevant is the generated title wrt. the document content?)
    • Conciseness (how succinct and clear is the generated title?)
    • Fluency (how grammatically correct is the generated title?)

NOTES

  • DTATG only works if you can extract sentences from the document, and at least some of those sentences must summarize the contents of the text.

MY 2¢

  • I think this would more accurately be described as a candidate title ranker than a title generator, because it requires that you have some candidate sentences beforehand.
