Summarization is one of the core tasks in Natural Language Processing (NLP). Text summarization aims to produce a concise and fluent summary that conveys the most essential, salient content of a document while preserving its original meaning. Abstractive summarization, in particular, aims to generate a concise version of the original text while keeping its meaning intact. Since manual text summarization is time-consuming and generally laborious, automating the task has gained immense popularity and constitutes a strong motivation for academic research. Prominent present-day applications include news summarization, opinion summarization, and headline generation, to name a few.

Two main summarization paradigms are used in current NLP: extractive and abstractive. Extractive summarization builds a summary by selecting salient sentences or phrases directly from the source text, whereas abstractive methods paraphrase and restructure sentences to compose the summary. Our main focus here is on abstractive summarization, as it is more flexible and can generate more diverse summaries.

BERT (Bidirectional Encoder Representations from Transformers) is a Transformer-based architecture that overcomes limitations of Recurrent Neural Networks (RNNs), such as difficulty in modeling long-term dependencies. In this work we use a document-level encoder-decoder model based on BERT that operates as a two-stage process: in the first stage, a BERT encoder maps the input sequence into context-rich representations; in the second stage, a Transformer-based decoder generates the output summary sequence.
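To make the two-stage design concrete, the following is a minimal sketch, not the exact architecture used in this work: a pretrained BERT encoder paired with a randomly initialized Transformer decoder. The class name, dimensions, and layer counts are illustrative assumptions; only the general encoder-decoder pattern is taken from the description above.

```python
# Minimal sketch of a BERT encoder + Transformer decoder for abstractive
# summarization. Hyperparameters here are illustrative, not the paper's.
import torch
import torch.nn as nn
from transformers import BertModel

class BertAbstractiveSummarizer(nn.Module):
    def __init__(self, vocab_size, d_model=768, num_decoder_layers=6, nhead=8):
        super().__init__()
        # Stage 1: a pretrained BERT encoder produces context-rich
        # representations of the input document.
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        # Stage 2: a Transformer decoder (trained from scratch) generates
        # the summary while attending to the encoder outputs.
        decoder_layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(decoder_layer, num_decoder_layers)
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out_proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Encode the source document once.
        memory = self.encoder(
            input_ids=src_ids, attention_mask=src_mask
        ).last_hidden_state
        # Causal mask so each summary token attends only to earlier tokens.
        causal = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.decoder(self.embed(tgt_ids), memory, tgt_mask=causal)
        return self.out_proj(hidden)  # logits over the target vocabulary
```

In this sketch the decoder is trained from scratch while the encoder starts from pretrained BERT weights, which mirrors the common practice for BERT-based abstractive summarizers of fine-tuning the encoder and decoder jointly on summarization data.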