LangChain 101: Extract structured data (JSON)
<p>In response to Medium's new policies, I am starting a series of short articles that deal only with the practical aspects of various LLM-related software.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:700/0*IOuZhyWvp0nzVHb5" style="height:467px; width:700px" /></p>
<p>Photo by <a href="https://unsplash.com/@margabagus?utm_source=medium&utm_medium=referral" rel="noopener ugc nofollow" target="_blank">Marga Santoso</a> on <a href="https://unsplash.com/?utm_source=medium&utm_medium=referral" rel="noopener ugc nofollow" target="_blank">Unsplash</a></p>
<h1>The Tutorial</h1>
<p>In this tutorial, we will learn how to extract structured data from <a href="https://arxiv.org/abs/2308.03279" rel="noopener ugc nofollow" target="_blank">free text</a>. Let's get some data.</p>
<pre>
# Sample text: the abstract of https://arxiv.org/abs/2308.03279
inp = """Large language models (LLMs) have demonstrated remarkable \
generalizability, such as understanding arbitrary entities and relations. \
Instruction tuning has proven effective for distilling LLMs \
into more cost-efficient models such as Alpaca and Vicuna. \
Yet such student models still trail the original LLMs by \
large margins in downstream applications. In this paper, \
we explore targeted distillation with mission-focused instruction \
tuning to train student models that can excel in a broad application \
class such as open information extraction. Using named entity \
recognition (NER) for case study, we show how ChatGPT can be distilled \
into much smaller UniversalNER models for open NER. For evaluation, \
we assemble the largest NER benchmark to date, comprising 43 datasets \
across 9 diverse domains such as biomedicine, programming, social media, \
law, finance. Without using any direct supervision, UniversalNER \
attains remarkable NER accuracy across tens of thousands of entity \
types, outperforming general instruction-tuned models such as Alpaca \
and Vicuna by over 30 absolute F1 points in average. With a tiny """  # the abstract is cut off here in the source</pre>
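<p>The extraction step itself can be sketched without any particular library. The example below is a minimal, library-agnostic sketch and not LangChain's API: the <code>PaperFacts</code> schema, the <code>build_prompt</code> and <code>parse_response</code> helpers, and the hand-written <code>fake_reply</code> are all assumptions made for illustration. It shows the general pattern of describing the desired fields to a model and validating the JSON that comes back.</p>

```python
import json
from dataclasses import dataclass, fields


# Hypothetical target schema: the facts we want to pull out of the abstract.
@dataclass
class PaperFacts:
    model_name: str
    task: str
    num_datasets: int
    num_domains: int


def build_prompt(text: str) -> str:
    """Build an instruction asking for one JSON object with the schema's keys."""
    keys = ", ".join(f'"{f.name}"' for f in fields(PaperFacts))
    return (
        "Extract the following fields from the text and reply with a single "
        f"JSON object with the keys {keys}.\n\nText:\n{text}"
    )


def parse_response(raw: str) -> PaperFacts:
    """Parse the model's reply and check that every expected field is present."""
    data = json.loads(raw)
    missing = [f.name for f in fields(PaperFacts) if f.name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return PaperFacts(
        model_name=str(data["model_name"]),
        task=str(data["task"]),
        num_datasets=int(data["num_datasets"]),
        num_domains=int(data["num_domains"]),
    )


# Hand-written stand-in for an LLM reply (assumption: no model is called here).
fake_reply = (
    '{"model_name": "UniversalNER", "task": "open NER", '
    '"num_datasets": 43, "num_domains": 9}'
)

facts = parse_response(fake_reply)
print(facts)
```

<p>In a real pipeline, <code>fake_reply</code> would be replaced by the text an LLM returns for <code>build_prompt(inp)</code>. Validating the reply against a schema catches missing or mistyped fields before they reach downstream code.</p>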
<p><a href="https://pub.towardsai.net/langchain-101-extract-structured-data-json-f68f5d78160e"><strong>Visit Now</strong></a></p>