With the advent of Llama 2, running strong LLMs locally has become increasingly feasible. Its accuracy approaches that of OpenAI's GPT-3.5, which is sufficient for many use cases.
In this article, we will explore how to use Llama 2 for topic modeling without passing every single document to the model. Instead, we will leverage BERTopic, a modular topic modeling technique that can use any LLM to fine-tune topic representations.
BERTopic is rather straightforward: it consists of five sequential steps:
- Embedding documents
- Reducing the dimensionality of the embeddings
- Clustering the reduced embeddings
- Tokenizing the documents per cluster
- Extracting the best-representing words per cluster