Choudhury, Sourav
Deep Natural Language Generation Using BERT for Summarization
1 online resource (67 pages) : PDF
2020
University of North Carolina at Charlotte
Summarization is one of the core facets of Natural Language Processing. Text summarization is the task of producing a concise and fluent summary that retains the most salient content of a text and preserves its original meaning. Because manual summarization is time-consuming and laborious, automating the task has gained immense popularity and constitutes a strong motivation for academic research. Current applications of text summarization include news summarization, opinion summarization, and headline generation, to name a few. We look at the two main summarization techniques used in current NLP tasks: extractive and abstractive. Extractive summarization generates a summary by selecting salient sentences or phrases from the source text, while abstractive methods paraphrase and restructure sentences to compose the summary. Our main focus is abstractive summarization, as it is more flexible and can generate more diverse summaries. BERT (Bidirectional Encoder Representations from Transformers) is a transformer-based architecture that overcomes a key limitation of Recurrent Neural Networks (RNNs), namely their difficulty with long-term dependencies. In this work we use a document-level encoder-decoder based on BERT that works as a two-stage process. In the first stage, the encoder maps the input sequence into context-rich representations; in the second, a transformer-based decoder generates the output sequence.
masters theses
Computer science; Artificial intelligence
M.S.
Attention; BERT; Natural Language Processing; Sequence to Sequence; Text Summarization; Transformers
Computer Science
Shaikh, Samira
Levens, Sara; Gallicano, Tiffany
Thesis (M.S.)--University of North Carolina at Charlotte, 2020.
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). For additional information, see http://rightsstatements.org/page/InC/1.0/.
Copyright is held by the author unless otherwise indicated.
Choudhury_uncc_0694N_12670
http://hdl.handle.net/20.500.13093/etd:2100