Summarization is one of the core tasks in Natural Language Processing (NLP). Text summarization aims to produce a concise and fluent summary that conveys the most essential, salient content of a document while preserving its original meaning. Abstractive summarization, in particular, aims to generate a concise version of the original text while keeping its meaning intact. Since manual text summarization is time-consuming and generally laborious, automating the task has gained immense popularity and constitutes a strong motivation for academic research. Prominent present-day applications include news summarization, opinion summarization, and headline generation, to name a few.

Two main summarization paradigms are used in current NLP: extractive and abstractive. Extractive summarization builds a summary by selecting salient sentences or phrases directly from the source text, whereas abstractive methods paraphrase and restructure sentences to compose the summary. Our main focus here is on abstractive summarization, as it is more flexible and can generate more diverse summaries.

BERT (Bidirectional Encoder Representations from Transformers) is a Transformer-based architecture that overcomes limitations of Recurrent Neural Networks (RNNs), such as difficulty in modeling long-term dependencies. In this work we use a document-level encoder-decoder model based on BERT that operates as a two-stage process: in the first stage, a BERT encoder maps the input sequence into context-rich representations; in the second stage, a Transformer-based decoder generates the output summary sequence.
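To make the two-stage design concrete, the following is a minimal sketch, not the exact architecture used in this work: a pretrained BERT encoder paired with a randomly initialized Transformer decoder. The class name, dimensions, and layer counts are illustrative assumptions; only the general encoder-decoder pattern is taken from the description above.

```python
# Minimal sketch of a BERT encoder + Transformer decoder for abstractive
# summarization. Hyperparameters here are illustrative, not the paper's.
import torch
import torch.nn as nn
from transformers import BertModel

class BertAbstractiveSummarizer(nn.Module):
    def __init__(self, vocab_size, d_model=768, num_decoder_layers=6, nhead=8):
        super().__init__()
        # Stage 1: a pretrained BERT encoder produces context-rich
        # representations of the input document.
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        # Stage 2: a Transformer decoder (trained from scratch) generates
        # the summary while attending to the encoder outputs.
        decoder_layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(decoder_layer, num_decoder_layers)
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out_proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Encode the source document once.
        memory = self.encoder(
            input_ids=src_ids, attention_mask=src_mask
        ).last_hidden_state
        # Causal mask so each summary token attends only to earlier tokens.
        causal = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.decoder(self.embed(tgt_ids), memory, tgt_mask=causal)
        return self.out_proj(hidden)  # logits over the target vocabulary
```

In this sketch the decoder is trained from scratch while the encoder starts from pretrained BERT weights, which mirrors the common practice for BERT-based abstractive summarizers of fine-tuning the encoder and decoder jointly on summarization data.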