LawBERT: Towards a Legal Domain-Specific BERT?

Google's Bidirectional Encoder Representations from Transformers (BERT) is a large-scale pre-trained autoencoding language model introduced in 2018. Its release has been described as the NLP community's "ImageNet moment", largely because of how well BERT performs on downstream natural language understanding tasks with only minimal fine-tuning (typically 2–4 epochs).

![](https://miro.medium.com/v2/resize:fit:700/1*O42cqkWhFW1n2IzOGeWj7Q.png)

Source: [Devlin et al. (2019)](https://arxiv.org/abs/1810.04805)

For context, traditional word embeddings (e.g. word2vec and GloVe) are *non-contextual*: they represent each word token with a single static vector, learned from word co-occurrence statistics rather than from the sequential context in which a word appears. This becomes problematic when words are polysemous (i.e. the same word carries multiple distinct meanings), which is very common in law.
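To make that contrast concrete, here is a minimal sketch (my own illustration, not from the article) that pulls BERT's contextual vector for the polysemous word "charge" out of two different sentences using the Hugging Face Transformers library. The example sentences and the helper `embedding_of` are assumptions for demonstration only.

```python
# Sketch: contrasting static vs. contextual embeddings for a polysemous word.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embedding_of(sentence: str, word: str) -> torch.Tensor:
    """Return BERT's contextual hidden-state vector for `word` within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]          # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

# "charge" as a criminal accusation vs. a fee: same token, different contexts.
v1 = embedding_of("the defendant faces a charge of fraud", "charge")
v2 = embedding_of("the bank applied a service charge to the account", "charge")

cos = torch.nn.functional.cosine_similarity(v1, v2, dim=0)
print(f"cosine similarity between the two 'charge' vectors: {cos.item():.3f}")
# A static embedding (word2vec/GloVe) would assign both occurrences the very
# same vector, so the equivalent similarity would always be exactly 1.0.
```

The two contextual vectors differ, reflecting the two senses of "charge"; a non-contextual model has no way to make that distinction.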
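And as a sketch of how little task-specific training BERT typically needs, the snippet below fine-tunes `bert-base-uncased` on a toy binary classification task over a couple of invented clause-like sentences. The label scheme, learning rate, and three-epoch loop are illustrative assumptions, not a recipe from the article.

```python
# Sketch: fine-tuning BERT for a downstream classification task in a few epochs.
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Hypothetical toy data: clause texts labelled 1 (binding) / 0 (non-binding).
texts = [
    "The parties hereby agree to the terms set forth herein.",
    "This summary is provided for convenience only.",
]
labels = torch.tensor([1, 0])

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):  # BERT usually needs only 2-4 epochs of fine-tuning
    optimizer.zero_grad()
    outputs = model(**inputs, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss = {outputs.loss.item():.4f}")
```

Only the small classification head is learned from scratch; the pre-trained encoder weights are merely nudged, which is why so few epochs suffice.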