Tokenization Using Indic NLP Library

<p>Hello! Or rather, नमस्ते, since today&rsquo;s topic is Indian languages.</p> <p><strong>Natural Language Processing</strong> looks fascinating, but like Machine Learning it depends on data cleaning and data pre-processing.</p> <p>Sounds boring, right? But it&rsquo;s not our fault&hellip; machines never tried to learn human languages. It was us who generously learnt numbers to communicate with them. Jokes apart, <strong>tokenization</strong> is an integral part of data pre-processing. Basically, we split the text into smaller units called <strong>tokens</strong>, which can be words or characters.</p> <p><a href="https://mrraghav.medium.com/tokenization-using-indic-nlp-library-257a9a44a272"><strong>Click Here</strong></a></p>
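<p>To make the idea of word tokens versus character tokens concrete, here is a minimal sketch in plain Python. It does not use the Indic NLP Library; the function names <code>word_tokens</code> and <code>char_tokens</code> and the small punctuation set are assumptions made up for illustration, and a real tokenizer handles many more cases.</p>

```python
import re

# Hypothetical punctuation set for this sketch: the Devanagari danda plus a
# few common ASCII marks. A real tokenizer would cover far more symbols.
PUNCT = "।!?,."

def word_tokens(text):
    """Split text into word tokens, keeping each punctuation mark separate."""
    pattern = rf"[^\s{re.escape(PUNCT)}]+|[{re.escape(PUNCT)}]"
    return re.findall(pattern, text)

def char_tokens(text):
    """Split text into character tokens (Unicode code points), skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]

if __name__ == "__main__":
    sample = "नमस्ते दुनिया!"
    print(word_tokens(sample))
    print(char_tokens(sample))
```

<p>Note that <code>char_tokens</code> splits on Unicode code points, so a Devanagari vowel sign or virama comes out as its own token rather than staying attached to its consonant. Whether that is acceptable depends on the task.</p>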
Tags: NLP Library