USE – Universal Sentence Encoder

At a high level, the Universal Sentence Encoder (USE) consists of an encoder that maps any sentence to a fixed-length sentence embedding, which can then be used for downstream NLP tasks.

The encoder part comes in two forms, and either of them can be used:

  • Transformer – here the encoder part of the original Transformer architecture is used. It consists of 6 stacked transformer layers, each containing a self-attention module followed by a feed-forward network. The resulting context-aware word embeddings are summed element-wise and divided by the square root of the sentence length to account for differences in sentence length, giving a 512-dimensional sentence embedding as output.
  • Deep averaging network (DAN) – the embeddings of the words and bigrams present in a sentence are averaged together and then passed through a 4-layer feed-forward network to produce a 512-dimensional sentence embedding. The word and bigram embeddings are learned during training. (Both pooling strategies are illustrated in the sketch after this list.)
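
To make the two pooling strategies concrete, below is a minimal NumPy sketch. The random vectors stand in for the learned context-aware word embeddings (transformer variant) and the learned word/bigram embeddings (DAN variant); the shapes, layer weights, and variable names are illustrative assumptions, not USE's actual internals.

Python3

import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 6, 512                     # toy sentence length and embedding size

# Toy stand-ins for learned representations (illustrative only)
context_aware = rng.normal(size=(seq_len, dim))             # transformer encoder outputs
word_and_bigram = rng.normal(size=(2 * seq_len - 1, dim))   # word + bigram embeddings

# Transformer variant: element-wise sum scaled by sqrt(sentence length)
transformer_sentence_vec = context_aware.sum(axis=0) / np.sqrt(seq_len)

# DAN variant: average the word/bigram embeddings, then pass the result
# through a 4-layer feed-forward network (random toy weights here)
h = word_and_bigram.mean(axis=0)
for _ in range(4):
    W = rng.normal(size=(dim, dim)) * 0.01
    b = np.zeros(dim)
    h = np.maximum(h @ W + b, 0.0)        # ReLU feed-forward layer
dan_sentence_vec = h

print(transformer_sentence_vec.shape, dan_sentence_vec.shape)   # (512,) (512,)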

Training of the USE

The USE is trained on a variety of unsupervised and supervised tasks, such as Skip-Thought-style prediction and natural language inference (NLI), following the steps below.

  • Convert the sentences to lowercase and tokenize them.
  • Depending on the type of encoder, each sentence is converted to a 512-dimensional vector.
  • The resulting sentence embeddings feed the training tasks, and the task loss is used to update the model parameters (a schematic sketch of this idea follows the list).
  • The trained model is then applied to new sentences to produce their 512-dimensional sentence embeddings.
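
The actual USE training setup (a shared encoder trained jointly on Skip-Thought-style, conversational response, and SNLI objectives) is more involved than can be shown here. The following is only a schematic Keras sketch of the idea in the steps above: the sentence embedding feeds a task-specific head, and the task loss updates the encoder's parameters. The vocabulary size, number of classes, and random batch are made-up assumptions.

Python3

import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, NUM_CLASSES = 10_000, 512, 3   # illustrative sizes

# Toy DAN-style encoder: average token embeddings, then a feed-forward layer
encoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(EMBED_DIM, activation="relu"),
])

# Task-specific head (e.g. a 3-way classification task); its loss
# backpropagates through the shared encoder and updates its parameters
toy_model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
toy_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Dummy batch of integer-encoded (tokenized) sentences and labels
x = tf.random.uniform((8, 12), maxval=VOCAB_SIZE, dtype=tf.int32)
y = tf.random.uniform((8,), maxval=NUM_CLASSES, dtype=tf.int32)
toy_model.fit(x, y, epochs=1, verbose=0)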

(Figure: Training of the encoder)

Python Implementation

We load the Universal Sentence Encoder’s TF Hub module.

  • module_url contains the URL to load the Universal Sentence Encoder (version 4) from TensorFlow Hub.
  • The hub.load function is used to load the Universal Sentence Encoder model from the specified URL (module_url).
  • We define a function named embed that takes an input text and returns the embeddings using the loaded Universal Sentence Encoder model.

Python3

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np

# Load the Universal Sentence Encoder (version 4) from TensorFlow Hub
module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
model = hub.load(module_url)
print("module %s loaded" % module_url)


def embed(texts):
    # Return a 512-dimensional embedding for each input sentence
    return model(texts)


We compute the similarity scores for our sample data below. Note that scipy's distance.cosine returns the cosine distance, so the cosine similarity is obtained as 1 - distance.cosine(u, v).

Python3

from scipy.spatial import distance

# Test sentence to compare against
test = ["I liked the movie very much"]
print('Test Sentence:', test)
test_vec = embed(test)

# Sample sentences
sentences = [["The movie is awesome and It was a good thriller"],
             ["We are learning NLP throughg w3wiki"],
             ["The baby learned to walk in the 5th month itself"]]

for sent in sentences:
    # Cosine similarity = 1 - cosine distance
    similarity_score = 1 - distance.cosine(test_vec[0, :], embed(sent)[0, :])
    print(f'\nFor {sent}\nSimilarity Score = {similarity_score} ')

Output

Test Sentence: ['I liked the movie very much']

For ['The movie is awesome and It was a good thriller']
Similarity Score = 0.6519516706466675

For ['We are learning NLP throughg w3wiki']
Similarity Score = 0.06988027691841125

For ['The baby learned to walk in the 5th month itself']
Similarity Score = -0.01121298223733902
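
Building on the embed function defined above (this assumes the TF Hub model from the earlier snippet is already loaded), a short optional extension scores every pair of sentences at once by unit-normalizing the embeddings and taking dot products, which is equivalent to cosine similarity.

Python3

import numpy as np

sample = ["I liked the movie very much",
          "The movie is awesome and It was a good thriller",
          "The baby learned to walk in the 5th month itself"]

emb = np.asarray(embed(sample))                          # shape: (3, 512)
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)   # unit-normalize each row
similarity_matrix = emb @ emb.T                          # cosine similarity of every pair
print(np.round(similarity_matrix, 2))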

Different Techniques for Sentence Semantic Similarity in NLP

Semantic similarity is the similarity between two words, sentences, phrases, or texts. It measures how close or how different two pieces of text are in terms of their meaning and context.

In this article, we focus on how the semantic similarity between two sentences is derived. We cover the following widely used models.

  1. Doc2Vec – an extension of Word2Vec that learns fixed-size embeddings for whole documents or sentences.
  2. SBERT – a Transformer-based model in which the encoder part captures the meaning of the words in a sentence.
  3. InferSent – it uses a bi-directional LSTM to encode sentences and infer semantics.
  4. USE (Universal Sentence Encoder) – a model trained by Google that generates fixed-size sentence embeddings that can be used for many NLP tasks.
