Wav2Vec2
The architecture of HuBERT is very similar to that of Wav2Vec2; what differs is the training process. Let's first get a brief understanding of the Wav2Vec2 model.
Wav2Vec2 is a deep learning model designed for automatic speech recognition (ASR). Developed by Facebook AI Research and introduced in 2020, it is a significant advancement in ASR technology. It builds on the original Wav2Vec model and leverages the power of transformers, with a training objective similar to BERT's masked language modeling, adapted for speech. The four key components of Wav2Vec2 are: the feature encoder, the context network, the quantization module, and the contrastive loss (the pre-training objective).
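To make the components concrete, the sketch below traces the tensor shapes through the first two stages. The stride and hidden size follow the base Wav2Vec2 configuration, but the arrays here are placeholders, not a real model:

```python
import numpy as np

# Illustrative shapes only: a 1-second, 16 kHz waveform passes through the
# feature encoder (a stack of strided convolutions), then the transformer
# context network. The arrays are zeros; only the shapes are meaningful.
sample_rate = 16_000
waveform = np.zeros(sample_rate)  # 1 second of raw audio

# The conv stack of the base model has a total stride of 320 samples,
# so the feature encoder emits roughly one latent vector per 20 ms.
total_stride = 320
num_frames = len(waveform) // total_stride  # ~50 frames per second
hidden_dim = 768                            # base model width

latents = np.zeros((num_frames, hidden_dim))  # feature-encoder outputs z_t
context = np.zeros_like(latents)              # context-network outputs c_t
```

The quantization module maps each latent `z_t` to a discrete codebook entry, which serves as the target in the contrastive pre-training objective described next.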
Wav2Vec2 operates in a two-step process: pre-training and fine-tuning. During pre-training, it learns from large amounts of unlabeled audio: spans of the latent speech representations are masked, and the model is trained to identify the correct quantized representation for each masked time step among a set of distractors. No text transcriptions are used at this stage. This allows it to capture phonetic and linguistic features from the audio alone.
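The contrastive objective for a single masked time step can be sketched as follows. This is a simplified numpy illustration with made-up vectors, not the paper's exact loss (which also includes a codebook-diversity term); the temperature value is an arbitrary choice:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between one context vector and a stack of candidates.
    return (b @ a) / (np.linalg.norm(b, axis=-1) * np.linalg.norm(a) + 1e-8)

def contrastive_loss(context, true_target, distractors, temperature=0.1):
    # Negative log-likelihood of picking the true quantized target
    # among the distractors, based on scaled cosine similarity.
    candidates = np.vstack([true_target[None, :], distractors])
    sims = cosine_sim(context, candidates) / temperature
    log_probs = sims - np.log(np.sum(np.exp(sims)))
    return -log_probs[0]

rng = np.random.default_rng(0)
dim = 16
target = rng.normal(size=dim)
context = target + 0.1 * rng.normal(size=dim)   # context close to its target
distractors = rng.normal(size=(5, dim))          # sampled from other time steps
loss = contrastive_loss(context, target, distractors)
```

When the context network's output at a masked position is close to the correct quantized target, the loss is small; a context vector unrelated to the target yields a loss near chance level.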
After pre-training, Wav2Vec2 can be fine-tuned on a comparatively small amount of labeled data for specific ASR tasks. It has shown impressive results on various ASR benchmarks, reducing the need for the extensive amounts of transcribed data traditionally required by ASR systems. Wav2Vec2 has had a significant impact on the development of more accurate and efficient speech recognition models.
HuBERT Model
Since the introduction of the Wav2Vec model, self-supervised learning research in speech has gained momentum. HuBERT is a self-supervised model that allows a BERT-style model to be applied to audio inputs. Applying BERT to sound is challenging because sound units have variable length and each input can contain multiple sound units, so the audio must first be discretized. This is achieved through hidden units (Hu), as explained in detail below; hence the name HuBERT. Before understanding HuBERT, however, we need a basic understanding of BERT, as HuBERT is based on it.
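The discretization idea can be illustrated with a toy sketch: cluster frame-level acoustic features with k-means and use the cluster IDs as discrete "hidden units". This is only a schematic of the idea; HuBERT actually clusters MFCC features (and, in later iterations, the model's own intermediate representations) of real speech, and the data below is synthetic:

```python
import numpy as np

def init_centroids(features, k, rng):
    # Greedy farthest-point initialization to spread the starting centroids.
    centroids = [features[rng.integers(len(features))]]
    for _ in range(k - 1):
        dists = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(features[np.argmax(dists)])
    return np.array(centroids)

def kmeans(features, k, iters=20, seed=0):
    # Minimal k-means: assigns every frame one of k discrete unit labels.
    rng = np.random.default_rng(seed)
    centroids = init_centroids(features, k, rng)
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = features[labels == j].mean(axis=0)
    return labels, centroids

# Toy "acoustic frames": three well-separated groups of 30 frames each,
# standing in for MFCC frames of an utterance.
rng = np.random.default_rng(1)
frames = np.vstack(
    [rng.normal(loc=c, scale=0.1, size=(30, 4)) for c in (0.0, 2.0, 4.0)]
)
units, _ = kmeans(frames, k=3)  # one discrete unit ID per frame
```

Once every frame carries a discrete unit label, BERT-style masked prediction becomes possible: mask a span of frames and train the model to predict the hidden-unit IDs of the masked region.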