[NLP] Basics: Measuring The Linguistic Complexity of Text

<p>Determining the linguistic complexity of a text is one of the first basic steps you learn in natural language processing. In this article, I take you through what linguistic complexity is and how to measure it.</p>

<h1>The pre-processing steps</h1>

<p>First, you need to tokenise your corpora. In other words, you need to break the sentences in your corpus of text into separate words (tokens). You should also remove punctuation, symbols and numbers, and transform all words to lowercase. In the code below, I show you how to do this using the <em>quanteda</em> package.</p>
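<p>The original snippet is not reproduced here, so the following is a minimal sketch of how this pre-processing step could look with quanteda. The corpus object <code>my_corpus</code>, the token object <code>my_tokens</code> and the sample sentences are placeholders for your own data.</p>

<pre><code># Load quanteda for tokenisation (assumes the package is installed)
library(quanteda)

# Illustrative corpus; replace with your own text data
my_corpus &lt;- corpus(c("The quick brown fox jumps over the lazy dog!",
                      "Measuring linguistic complexity: 3 simple steps."))

# Tokenise, dropping punctuation, symbols and numbers
my_tokens &lt;- tokens(my_corpus,
                    remove_punct   = TRUE,
                    remove_symbols = TRUE,
                    remove_numbers = TRUE)

# Transform all tokens to lowercase
my_tokens &lt;- tokens_tolower(my_tokens)

print(my_tokens)
</code></pre>

<p>Running this returns a tokens object with one entry per document, containing only lowercase word tokens, which is the form most complexity measures expect as input.</p>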