Torchaudio Dataset
Loading the demo YESNO audio dataset from torchaudio using PyTorch.
YESNO is an audio waveform dataset in which each sample is a tuple of three values: waveform, sample_rate, and labels. Here waveform is the audio signal as a tensor, sample_rate is the sampling rate of the recording in Hz, and labels is a list of integers marking each spoken word as yes (1) or no (0).
- Import the torch and torchaudio packages (install with pip install torchaudio if necessary).
- Access the dataset through torchaudio.datasets followed by the dataset class name, here YESNO.
- Pass the path where the dataset should be stored and specify download=True to download it. Here, './' refers to the current working directory.
- Iterate over the loaded dataset with a for loop and unpack the three values of each tuple to inspect a sample.
To load your custom data:
Syntax: torch.utils.data.DataLoader(data, batch_size, shuffle)
Parameters:
- data – the dataset object to load samples from, not a file path (the parameter is named dataset in PyTorch)
- batch_size – the number of samples to load per batch; useful for large datasets that should not be processed all at once
- shuffle – a bool. Setting it to True reshuffles the data at every epoch.
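As a minimal sketch of the syntax above, the following wraps a custom dataset in a DataLoader. The ToneDataset class and its parameters are hypothetical and exist only for illustration: it generates fixed-length sine tones in memory so that the example runs without any audio files. A real custom dataset would read recordings from disk inside __getitem__ instead.

```python
import math

import torch
from torch.utils.data import DataLoader, Dataset


class ToneDataset(Dataset):
    """Hypothetical custom dataset: fixed-length sine tones with a 0/1 label."""

    def __init__(self, num_items=8, sample_rate=8000, num_samples=8000):
        self.num_items = num_items
        self.sample_rate = sample_rate
        self.num_samples = num_samples

    def __len__(self):
        return self.num_items

    def __getitem__(self, index):
        # Even indices get a 220 Hz tone (label 0), odd indices a 440 Hz tone (label 1).
        label = index % 2
        freq = 440.0 if label else 220.0
        t = torch.arange(self.num_samples) / self.sample_rate
        # Add a channel dimension so the shape is (1, num_samples).
        waveform = torch.sin(2 * math.pi * freq * t).unsqueeze(0)
        return waveform, self.sample_rate, label


data = ToneDataset()
loader = DataLoader(data, batch_size=4, shuffle=True)

# Each batch stacks four samples: waveforms has shape (4, 1, 8000).
for waveforms, sample_rates, labels in loader:
    print(waveforms.shape, sample_rates, labels)
```

Because every waveform here has the same length, the default batching can stack them directly; datasets with recordings of different lengths need extra handling before they can be batched.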
Python3
# import the torch and torchaudio packages
import torch
import torchaudio

# access the dataset in the torchaudio package using
# datasets followed by the dataset name.
# './' stores the dataset in the current working directory.
# download=True ensures that the data gets downloaded.
yesno_data = torchaudio.datasets.YESNO('./', download=True)

# load the first 5 samples from yesno_data
for i in range(5):
    waveform, sample_rate, labels = yesno_data[i]
    print("Waveform: {}\nSample rate: {}\nLabels: {}".format(
        waveform, sample_rate, labels))
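Output: each iteration prints the waveform tensor, the sample rate, and the list of labels for one of the first five recordings.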
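When such samples are batched with a DataLoader and batch_size greater than 1, recordings of different lengths cannot be stacked by the default batching. One possible approach, sketched below, is to pass a custom collate_fn that zero-pads each waveform to the longest one in the batch. The pad_collate function is a hypothetical helper, and the stand-in samples are random noise shaped like (waveform, sample_rate, labels) tuples, assuming waveforms of shape (channels, time), so that the sketch runs without a download.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader


def pad_collate(batch):
    """Zero-pad (waveform, sample_rate, labels) samples to the longest waveform."""
    waveforms, sample_rates, labels = zip(*batch)
    # Remember the original lengths so padding can be ignored later.
    lengths = torch.tensor([w.shape[-1] for w in waveforms])
    # pad_sequence pads along the first dimension, so put time first: (1, T) -> (T, 1).
    padded = pad_sequence([w.t() for w in waveforms], batch_first=True)
    padded = padded.transpose(1, 2)  # (batch, 1, T_max)
    return padded, lengths, torch.tensor(sample_rates), torch.tensor(labels)


# Stand-in samples (assumption): random noise of different lengths, 8 labels each.
samples = [
    (torch.randn(1, 400), 8000, [0, 1, 0, 1, 1, 0, 0, 1]),
    (torch.randn(1, 600), 8000, [1, 1, 0, 0, 1, 0, 1, 0]),
    (torch.randn(1, 500), 8000, [0, 0, 0, 1, 1, 1, 0, 1]),
]

# A plain list works as a dataset because it supports indexing and len().
loader = DataLoader(samples, batch_size=3, collate_fn=pad_collate)

padded, lengths, sample_rates, labels = next(iter(loader))
print(padded.shape, lengths, labels.shape)
```

The same collate_fn could be passed when wrapping yesno_data in a DataLoader, provided its samples follow the tuple layout assumed here.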
Loading Data in PyTorch
In this article, we will discuss how to load different kinds of data in PyTorch.
For demonstration purposes, the PyTorch ecosystem provides datasets through three domain libraries: torchaudio, torchvision, and torchtext. We can use these demo datasets to understand how to load sound, image, and text data in PyTorch.