Online review communities such as Trustpilot give NLP enthusiasts a practically endless dataset to use as a playground for NLP tasks. Our aim is to use this data to perform two key tasks:
- Summarization: to have a pre-trained model capture only the most important parts of the text
- Keyword Extraction: to extract not only the most relevant keywords but also the proper nouns discussed in the reviews, for further CRM analysis
By the end of this series, the model should return output like the one in the photo below (it is also running here on Hugging Face). However, to keep the content solid, this first post of the series focuses only on Task 1: Summarization.