USE – Universal Sentence Encoder
At a high level, it consists of an encoder that summarizes any sentence into a fixed-length sentence embedding, which can then be used for a downstream NLP task.
The encoder comes in two forms, and either of them can be used:
- Transformer – Here the encoder part of the original transformer architecture is used. It consists of 6 stacked transformer layers, each containing a self-attention module followed by a feed-forward network. The resulting context-aware word embeddings are summed element-wise and divided by the square root of the sentence length to compensate for differences in sentence length. The output is a 512-dimensional sentence embedding.
- Deep averaging network (DAN) – The embeddings of the words and bi-grams present in a sentence are averaged together and then passed through a 4-layer feed-forward network to get a 512-dimensional sentence embedding as output. The word and bi-gram embeddings are learned during training. A minimal sketch of the two pooling strategies is given after this list.
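To make the two pooling strategies concrete, here is a minimal NumPy sketch. The token vectors are random stand-ins for the context-aware embeddings an encoder would produce, so this only illustrates the pooling step and is not the actual USE implementation.
Python3
import numpy as np

# Made-up stand-ins for the context-aware token embeddings of a 4-word sentence;
# in USE the embedding dimension is 512.
token_embeddings = np.random.rand(4, 512)

# Transformer variant: element-wise sum of the token embeddings,
# divided by the square root of the sentence length.
transformer_pooled = token_embeddings.sum(axis=0) / np.sqrt(token_embeddings.shape[0])

# DAN variant: average the word/bi-gram embeddings; the result would then
# be passed through a 4-layer feed-forward network (omitted here).
dan_pooled = token_embeddings.mean(axis=0)

print(transformer_pooled.shape, dan_pooled.shape)  # (512,) (512,)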
Training of the USE
The USE is trained on a variety of unsupervised and supervised tasks, such as a Skip-Thought-style task, natural language inference (NLI), and more, following the steps below.
- Tokenize the sentences after converting them to lowercase
- Depending on the type of encoder, the sentence gets converted to a 512-dimensional vector
- The resulting sentence embeddings are used to compute the loss on the training tasks, and the model parameters are updated accordingly.
- Once trained, the model is applied again to any new sentence to produce its 512-dimensional embedding. A heavily simplified sketch of these steps is shown below.
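The code below only sketches the idea behind these steps and is not the actual USE training code. The miniature encoder, the task head, the vocabulary size, and the fake batch are hypothetical placeholders; the real model is trained jointly on several tasks with its own tokenizer and architecture.
Python3
import numpy as np
import tensorflow as tf

# Step 1 (illustrative): lowercase and tokenize a sentence.
# In practice the tokens would be mapped to ids; a fake id batch is used below.
tokens = "The movie is awesome".lower().split()

# Hypothetical miniature encoder standing in for USE (all sizes made up).
vocab_size, embed_dim = 1000, 512
encoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.GlobalAveragePooling1D(),      # DAN-style averaging
    tf.keras.layers.Dense(embed_dim, activation="relu"),
])
task_head = tf.keras.layers.Dense(2)               # hypothetical training-task head

optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

token_ids = np.random.randint(0, vocab_size, size=(8, 12))    # fake tokenized batch
labels = np.random.randint(0, 2, size=(8,))                   # fake task labels

# Steps 2-3: encode each sentence to a 512-dimensional vector and use the
# embeddings to compute a task loss that updates the model parameters.
with tf.GradientTape() as tape:
    sentence_embeddings = encoder(token_ids)                  # shape (8, 512)
    loss = loss_fn(labels, task_head(sentence_embeddings))

variables = encoder.trainable_variables + task_head.trainable_variables
grads = tape.gradient(loss, variables)
optimizer.apply_gradients(zip(grads, variables))

# Step 4: after training, the encoder is applied again to get fresh embeddings.
new_embedding = encoder(np.random.randint(0, vocab_size, size=(1, 12)))
print(new_embedding.shape)   # (1, 512)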
Python Implementation
We load the Universal Sentence Encoder’s TF Hub module.
- module_url contains the URL to load the Universal Sentence Encoder (version 4) from TensorFlow Hub.
- The hub.load function is used to load the Universal Sentence Encoder model from the specified URL (module_url).
- We define a function named embed that takes an input text and returns the embeddings using the loaded Universal Sentence Encoder model.
Python3
import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns

# URL of the Universal Sentence Encoder (version 4) on TensorFlow Hub
module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"

# Load the model from TensorFlow Hub
model = hub.load(module_url)
print("module %s loaded" % module_url)

# Return the sentence embeddings for a list of input sentences
def embed(input):
    return model(input)
We compute the similarity score between a test sentence and each of the sample sentences below.
Python3
from scipy.spatial import distance

test = ["I liked the movie very much"]
print('Test Sentence:', test)

# Embedding of the test sentence
test_vec = embed(test)

# Sample sentences
sentences = [["The movie is awesome and It was a good thriller"],
             ["We are learning NLP throughg w3wiki"],
             ["The baby learned to walk in the 5th month itself"]]

for sent in sentences:
    # Cosine similarity = 1 - cosine distance between the two embeddings
    similarity_score = 1 - distance.cosine(test_vec[0, :], embed(sent)[0, :])
    print(f'\nFor {sent}\nSimilarity Score = {similarity_score}')
Output
Test Sentence: ['I liked the movie very much']
For ['The movie is awesome and It was a good thriller']
Similarity Score = 0.6519516706466675
For ['We are learning NLP throughg w3wiki']
Similarity Score = 0.06988027691841125
For ['The baby learned to walk in the 5th month itself']
Similarity Score = -0.01121298223733902
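Since seaborn and matplotlib are already imported above, the pairwise similarities can also be visualized as a heatmap. The sketch below is one possible way to do this with the embed function defined earlier; the plot styling is our own choice and not part of the original example.
Python3
# Illustrative sketch: visualize pairwise sentence similarities as a heatmap.
messages = ["I liked the movie very much",
            "The movie is awesome and It was a good thriller",
            "We are learning NLP throughg w3wiki",
            "The baby learned to walk in the 5th month itself"]

# USE vectors are approximately unit length, so the inner product of the
# embedding matrix with itself gives a matrix of pairwise similarity scores.
embeddings = embed(messages)
similarity_matrix = np.inner(embeddings, embeddings)

sns.heatmap(similarity_matrix,
            xticklabels=messages,
            yticklabels=messages,
            cmap="YlOrRd",
            annot=True)
plt.title("Pairwise sentence similarity (USE)")
plt.show()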
Different Techniques for Sentence Semantic Similarity in NLP
Semantic similarity is the similarity between two words, phrases, sentences, or longer texts. It measures how close or how different two pieces of text are in terms of their meaning and context.
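In practice, semantic similarity is usually computed as the cosine similarity between embedding vectors, exactly as in the USE example above. Here is a minimal NumPy sketch with made-up vectors:
Python3
import numpy as np

# Made-up vectors standing in for two sentence embeddings.
a = np.array([0.2, 0.7, 0.1])
b = np.array([0.25, 0.6, 0.3])

# Cosine similarity: dot product divided by the product of the vector norms.
cosine_similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine_similarity)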
In this article, we will focus on how the semantic similarity between two sentences is derived. We will cover the following most used models.
- Doc2Vec – An extension of word2vec
- SBERT – Transformer-based model in which the encoder part captures the meaning of words in a sentence.
- InferSent – It uses a bi-directional LSTM to encode sentences and infer semantics.
- USE (Universal Sentence Encoder) – A model trained by Google that generates fixed-size embeddings for sentences, which can be used for a wide range of NLP tasks.