What is the Central Limit Theorem?
The definition:
The central limit theorem states that if we repeatedly draw samples of a sufficiently large size n from any population with a finite mean and variance, the distribution of the sample means will be approximately normal, regardless of the shape of the original distribution. In addition, the mean of these sample means will equal the population mean, and the standard error (the standard deviation of the sample means) will decrease as the sample size increases.
Suppose we are sampling from a population with a finite mean (mu) and a finite standard deviation (sigma). Then the mean and standard deviation of the sampling distribution of the sample mean are given by:
\mu_{\bar{X}}=\mu \qquad \sigma_{\bar{X}}=\frac{\sigma}{\sqrt{n}}
Here \bar{X} is the mean of a sample of size n, \mu_{\bar{X}} and \sigma_{\bar{X}} are the mean and standard deviation of its sampling distribution, and \mu and \sigma are the mean and standard deviation of the population, respectively.
The distribution of the sample mean tends towards the normal distribution as the sample size n increases.
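The behaviour described above can be checked with a small simulation. The following is a minimal sketch, not a definitive implementation: it assumes an exponential population with mean 2.0 (chosen only because it is strongly skewed), and the helper names `sample_means` and `skewness` are illustrative. For each sample size it compares the standard deviation of the sample means with sigma / sqrt(n) and reports the skewness, which should move towards 0 as the distribution becomes more normal.

```python
import numpy as np

# Assumption: an exponential population with mean 2.0. For an exponential
# distribution the standard deviation equals the mean, so sigma = 2.0 too.
rng = np.random.default_rng(seed=0)
mu, sigma = 2.0, 2.0


def sample_means(n, num_samples=10_000):
    """Draw num_samples samples of size n and return the mean of each one."""
    samples = rng.exponential(scale=mu, size=(num_samples, n))
    return samples.mean(axis=1)


def skewness(x):
    """Sample skewness; values near 0 indicate a symmetric distribution."""
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


results = {}
for n in (2, 30, 500):
    means = sample_means(n)
    results[n] = {
        "mean": means.mean(),                # should be close to mu
        "std": means.std(ddof=1),            # observed standard error
        "expected_std": sigma / np.sqrt(n),  # sigma / sqrt(n) from the formula
        "skew": skewness(means),             # shrinks towards 0 as n grows
    }
    r = results[n]
    print(f"n={n:>3}  mean={r['mean']:.3f}  std={r['std']:.3f}  "
          f"sigma/sqrt(n)={r['expected_std']:.3f}  skew={r['skew']:.3f}")
```

A histogram of `sample_means(n)` for each n would show the same trend visually: the right-skewed shape at small n gives way to a bell shape as n grows.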
Uses of the Central Limit Theorem (CLT)
We can use the central limit theorem for various purposes in data science projects. Some of the key uses are listed below:
- Population Parameter Estimation – We can use the CLT to estimate population parameters, such as the population mean or population proportion, from sampled data.
- Hypothesis Testing – The CLT underpins many hypothesis tests. It helps in constructing test statistics, such as those of the z-test or t-test, by justifying the assumption that the sampling distribution of the test statistic is approximately normal.
- Confidence Interval – A confidence interval defines a range in which the population parameter is likely to lie. The CLT plays a crucial role in constructing confidence intervals for population parameters, because it lets us treat the sample mean as approximately normal with standard error sigma / sqrt(n).
- Sampling Techniques – Sampling techniques help in collecting representative samples and generalizing the findings to the larger population. The CLT supports various sampling techniques used in survey sampling and experimental design.
- Simulation and Monte Carlo Methods – These methods involve generating random samples from known distributions to approximate the behavior of complex systems or to estimate statistical quantities. The CLT plays a key role here, since it describes how the error of an averaged estimate shrinks as the number of random samples grows.
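Several of these uses can be illustrated together in a short sketch. The example below assumes hypothetical data drawn from a gamma population with true mean 3.0 (shape 2.0, scale 1.5), and all variable names are illustrative. It builds a 95% confidence interval and a two-sided z-test from a single sample using the normal approximation, then runs a small Monte Carlo experiment to see how often such intervals contain the true mean.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(seed=42)

# Assumption: a skewed gamma population with mean shape * scale = 3.0.
shape, scale = 2.0, 1.5
true_mean = shape * scale
n = 200

# --- Estimation and confidence interval from one sample ---
sample = rng.gamma(shape=shape, scale=scale, size=n)
x_bar = sample.mean()
std_err = sample.std(ddof=1) / np.sqrt(n)   # estimate of sigma / sqrt(n)

z_crit = NormalDist().inv_cdf(0.975)        # about 1.96 for a 95% interval
ci_low = x_bar - z_crit * std_err
ci_high = x_bar + z_crit * std_err

# --- Two-sided z-test of H0: population mean equals true_mean ---
z_stat = float((x_bar - true_mean) / std_err)
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"sample mean = {x_bar:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}")

# --- Monte Carlo check: how often does the interval cover the true mean? ---
num_trials = 2_000
samples = rng.gamma(shape=shape, scale=scale, size=(num_trials, n))
x_bars = samples.mean(axis=1)
std_errs = samples.std(axis=1, ddof=1) / np.sqrt(n)
covered = (x_bars - z_crit * std_errs <= true_mean) & \
          (true_mean <= x_bars + z_crit * std_errs)
coverage = covered.mean()

print(f"empirical coverage over {num_trials} trials = {coverage:.3f}")
```

If the normal approximation is reasonable at this sample size, the empirical coverage should land close to the nominal 95%. With a much smaller n or a more heavily skewed population, it can fall noticeably short.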
Python – Central Limit Theorem
Statistics is an important part of data science projects. We use statistical tools whenever we want to make an inference about a population from a sample of the dataset, gather information from the dataset, or make an assumption about a parameter of the dataset. In this article, we will talk about one of the most important statistical tools, the central limit theorem.