Text Summarization
Authors: Abhishek Patidar, Aditya Sharma, Aditya Jain, Alfaiz Khan
Abstract
Text summarization is the process of extracting important information from an original text and presenting it in the form of a summary. It has become a necessity in many applications, for example search engines, business analysis, and market review, because it helps readers obtain the required information in less time. This paper attempts to survey text summarization from its beginnings to the present. The two major approaches, extractive and abstractive summarization, are discussed in detail. The techniques deployed for summarization range from structured to linguistic. Work has also been done in many Indian languages, but it is presently in its infancy. This paper provides an overview of the present state of research on text summarization.
Introduction
The amount of data and information on the Internet continues to increase every day in the form of web pages, articles, academic papers, and news items. Despite this abundance, it is difficult to find the needed information efficiently, because most information is irrelevant to a particular user's needs at a particular time. The need for automatic summarization and extraction of relevant information therefore remains a productive research area within natural language processing. Automatic summarization helps extract useful information while discarding the irrelevant. It can also improve the readability of texts and decrease the time that users spend searching. Researchers have been working on automatic text summarization since the late 1950s. The goal is to generate summaries that combine the main points in a readable and cohesive way, without irrelevant or repeated information [1]. Text summarization methods usually extract important words, phrases, or sentences from a document and use them to create a summary. Text summarization can be classified into single-document and multi-document summarization, depending on the number of input documents. Single-document summarization accepts only one document as input [2], whereas multi-document summarization accepts more than one document, each related to the main topic. Meaningful information is extracted from each document and then gathered and organized to generate a summary [3] [4]. Extractive summarization chooses important sentences from a document and combines them to create a summary without changing the original sentences.
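The extractive idea described above, choosing important sentences and combining them unchanged, can be illustrated with a minimal sketch. It assumes a simple word-frequency score as the measure of sentence importance, which is one common baseline and not necessarily a technique evaluated in this paper. The function names, the tiny stopword list, and the naive sentence splitter are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "that", "on", "for", "as", "with", "are", "was"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, unchanged and in original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document (assumed importance signal).
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)
```

Calling `extractive_summary(document_text, num_sentences=3)` returns up to three sentences copied verbatim from the input. Practical systems typically replace the frequency score with stronger features, but the select-and-concatenate structure stays the same.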
Conclusion
Text summarization is growing as a sub-branch of NLP, driven by the demand for condensed, meaningful summaries of a topic given the large amount of information available on the Internet. Precise information makes searching more effective and efficient. Text summarization is therefore needed and used by business analysts, marketing executives, developers, researchers, government organizations, students, and teachers. Executives in particular require summarization so that the required information can be processed in limited time. This paper covers both the extractive and abstractive approaches in detail, along with the techniques used, the performance achieved, and the advantages and disadvantages of each approach. Text summarization is important to both the commercial and research communities. Because abstractive summarization requires more learning and reasoning, it is somewhat more complex than the extractive approach, but it provides a more meaningful and appropriate summary. The study also observes that very little work has been done using abstractive methods on Indian languages, so there is considerable scope for exploring such methods for more appropriate summarization.
Copyright
Copyright © 2025 Abhishek Patidar, Aditya Sharma, Aditya Jain, Alfaiz Khan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.