Evaluating Text Output in NLP: BLEU at your own risk

One question I get fairly often from folks who are just getting into NLP is how to evaluate systems whose output is text, rather than some sort of classification of the input text. These types of problems, where you put some text into your model and get some other text out of it, are known as sequence-to-sequence or string transduction problems. And they’re really interesting problems! Sequence-to-sequence modelling is at the heart of some of the most difficult tasks in NLP, including: <a href="https://towardsdatascience.com/evaluating-text-output-in-nlp-bleu-at-your-own-risk-e8609665a213">Read More</a>