Wavelet Transforms in Data Mining
The discrete wavelet transform (DWT) is a signal processing technique that transforms linear signals. The data vector X is transformed into a numerically different vector, Xo, of wavelet coefficients when the DWT is applied. The two vectors X and Xo must be of the same length. When applying this technique to data reduction, we consider n-dimensional data tuple, that is, X = (x1,x2,…,xn), where n is the number of attributes present in the relation of the data set.
The wavelet transforms the data can be truncated and this is helpful in data reduction. If we store a small fraction of the strongest wavelet coefficients then the compressed approximation of the original data can be obtained. For example, the wavelet coefficients larger than some determined threshold can be retained. The coefficients of the wavelet other than the user-determined data are set to 0. The resultant representation of data is very sparse. The computation of the operations is very fast if they are performed in wavelet space. This technique can also be used to remove the noise in the data. This reduces the task of smoothing the main features of the data and wavelet transforms also make the data cleaning very effective. The approximation of the original data can be done if the set of coefficients is given by applying the inverse of the DWT.
The discrete Fourier transform (DFT) is a signal processing technique that involves sines and cosines. The DWT is related to the DFT and it is based on the results of DFT. The DWT gains lossy compression well compared to the DFT. If DWT and DFT of a given data vector have the same number of coefficients then the DWT provides more accurate wavelet coefficients and appropriation of the data. The DWT occupies less space compared to the DFT. There is only one group of DFT but DWT has many groups of it. The most Popular wavelet transforms are Haar-2 and Daubechies-4. The discrete wavelet transform uses a hierarchical pyramid algorithm that halves the data at each iteration, the data is reduced to half, and thus speed of computation of data increases.
The method of the hierarchical pyramid is as follows:
- The input data vector is of length L and L is an integer and is the power of 2. If length L is not the power of 2 then we can append the zeroes at the end of input data vector to make it as power of 2.
- We apply two functions for each transform of the data vector . The first function is to perform the data smoothing, like finding the weighted average of the data vectors. The second function is to find the weighted difference and this retrieves the important features of the input vector .
- We apply the two functions to the X axis pairs of the data points (x2i ,x2i+1). Two different data sets of length L/2 are obtained after applying the two functions. The first data set is the low-frequency version of the original data and the second one is the high frequency data set of it.
- These two functions are applied to the data vectors recursively until the obtained resultant data vectors are of length 2.
- The wavelet coefficients are assigned to the transformed data vectors finally.
Contact Us