Have you ever wondered how AI and machine learning can be used to assign Initiatives to activities? The Named Entity Recognition (NER) Similarity model does just that. For each new activity, it finds the activity with a known Initiative whose entities (such as names, places, and acronyms) are most similar, and suggests that Initiative.

What is the NER Similarity Model?

The NER Similarity model uses a combination of machine learning tools to analyze text and identify Initiatives. It continuously learns from new activities, becoming better and more precise with each new dataset it encounters.

The NER Similarity Model consists of three components:

Pretrained NER Transformer: Think of this as the model's 'eyes'. It scans and recognizes important pieces of information in the text, such as names, places, or special terms.
TF-IDF Vectorizer: This can be seen as the model's 'translator'. It takes the entities identified by the 'eyes' and converts them into a numerical form (vectors) that the model can process. TF-IDF (term frequency-inverse document frequency) gives more weight to terms that appear often in one document but rarely in the others, so distinctive terms stand out.
Similarity Measure: This acts like the model's 'brain'. It uses a method called cosine similarity to compare the vectors and determine how closely related different pieces of information are. This helps the model decide which activities match best with which Initiatives.
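
To make the three components concrete, here is a minimal sketch of the 'translator' and 'brain' steps in plain Python. It is an illustration under stated assumptions, not the production implementation: the entity lists, the Initiative names, and the helper functions (`tfidf_vectors`, `cosine_similarity`, `most_similar_initiative`) are all hypothetical, the NER step is assumed to have already run, and the smoothed IDF weighting is just one common variant.

```python
import math
from collections import Counter

# Hypothetical data: entities assumed to be already extracted by the NER step.
known_activities = {
    "Initiative A": ["WHO", "Geneva", "malaria"],
    "Initiative B": ["UNICEF", "Nairobi", "education"],
}
new_activity = ["WHO", "malaria", "Kampala"]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per list of entities."""
    n_docs = len(docs)
    # Number of documents in which each entity appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_initiative(new_entities, known):
    """Return the known Initiative whose entities best match, with its score."""
    labels = list(known)
    vectors = tfidf_vectors([known[label] for label in labels] + [new_entities])
    query = vectors[-1]
    scores = {
        label: cosine_similarity(query, vec)
        for label, vec in zip(labels, vectors)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


best, score = most_similar_initiative(new_activity, known_activities)
print(best, round(score, 3))
```

In this toy example the new activity shares two entities with the first known activity and none with the second, so the first Initiative gets the higher score.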

Experimentation Performance

We've put our NER Similarity Model through extensive testing to ensure it's both effective and reliable. The model now makes a prediction only when it's confident about the similarity between elements, steering clear of elements that are almost identical. This careful approach has led to a notable increase in accuracy: when the model makes a prediction, it is correct about 95% of the time. 🙌
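
One simple way to get "predict only when confident" behaviour is a similarity threshold, sketched below. This is an assumption about how such a rule could look, not a description of the model's actual logic: the function name `predict_or_abstain`, the threshold of 0.8, and the example scores are all hypothetical.

```python
def predict_or_abstain(scores, threshold=0.8):
    """Return the best-matching Initiative, or None if no score reaches the threshold.

    `scores` maps each candidate Initiative to a similarity score.
    The default threshold is a placeholder, not a tuned value.
    """
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Hypothetical similarity scores for two new activities.
confident = predict_or_abstain({"Initiative A": 0.91, "Initiative B": 0.40})
uncertain = predict_or_abstain({"Initiative A": 0.55, "Initiative B": 0.52})
print(confident)  # Initiative A
print(uncertain)  # None
```

Raising the threshold generally trades coverage for correctness: the model stays silent on more activities, but the predictions it does make are more likely to be right.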

Because of this significant improvement, we've expanded the use of the model. It's proven to be a potent tool in accurately pairing Initiatives with the right activities, underscoring its reliability and efficiency.