A new approach to understanding the Central Limit Theorem

In this article, we try to make sense of what the Central Limit Theorem implies by looking at it from two perspectives: the theoretical and the experimental.



You have probably wondered at some point why statisticians seem so attached to the normal distribution :P

Let's try to understand why this obsession makes sense, using an experiment. We draw a sample of, say, 20 points from the uniform distribution, compute the mean (average) of these points, and plot that mean on a separate graph. We repeat this exercise a large number of times, drawing a fresh sample of 20 points each time, and keep plotting the means. What we finally get for the mean values is the well-known bell-shaped curve.
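The experiment described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the code linked later in this post; the function names, the 5000 repetitions and the text histogram are assumptions made for the example.

```python
import random
import statistics

def sample_means(num_samples, sample_size, seed=0):
    """Means of `num_samples` samples, each made of `sample_size` Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.random() for _ in range(sample_size))
        for _ in range(num_samples)
    ]

def text_histogram(values, bins=15, width=50):
    """Render a crude horizontal histogram as text (one line per bin)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / step), bins - 1)  # the maximum value goes in the last bin
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + i * step
        lines.append(f"{left_edge:6.3f} | " + "#" * round(width * c / peak))
    return "\n".join(lines)

# Repeat the "draw 20 points, take the mean" exercise 5000 times.
means = sample_means(num_samples=5000, sample_size=20)
print(text_histogram(means))
```

The printed bars should bulge in the middle (around 0.5, the mean of the uniform distribution on [0, 1]) and thin out towards both ends, which is the bell shape we are talking about.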

And boom, we just understood the Central Limit Theorem!!



Food for thought !!

The distribution we were sampling from is not a normal distribution, yet what we get by plotting the means of samples from it is a normal curve... What is the relevance of this theorem??

 

Graphical Approach  

Let's try to understand the same idea using the graphical approach:

 

 

  

So this is a uniform distribution generated using Python. From this distribution we take a sample of points (say 20) and compute its mean. We take such samples repeatedly, compute their means, and plot the means on the other graph below. What we finally get is approximately a normal distribution.





So here it is, a normal distribution. If we increase the sample size from 20 points to something higher and keep plotting the means, the plot gets even closer to a normal curve and also becomes narrower around the population mean.
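To see the effect of the sample size in numbers rather than pictures, here is a small sketch, again standard-library Python and not the code behind the plots. It compares the observed spread of the sample means with the value sigma / sqrt(sample size), where sigma = sqrt(1/12) is the standard deviation of the Uniform(0, 1) distribution. The sample sizes 5, 20 and 80 and the 4000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

def spread_of_means(sample_size, num_samples=4000, seed=1):
    """Standard deviation of the means of many Uniform(0, 1) samples of a given size."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.random() for _ in range(sample_size))
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)

UNIFORM_SD = (1 / 12) ** 0.5  # standard deviation of Uniform(0, 1)

results = {}
for size in (5, 20, 80):
    observed = spread_of_means(size)
    predicted = UNIFORM_SD / size ** 0.5
    results[size] = (observed, predicted)
    print(f"sample size {size:>2}: observed sd = {observed:.4f}, sigma/sqrt(n) = {predicted:.4f}")
```

Each time the sample size is multiplied by 4, the spread of the means should roughly halve.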

 

 

The Python code for generating the distributions and plotting them can be downloaded from the following link:

https://drive.google.com/file/d/1NZMuTu0z-Q6AfdufFwt0h3JNncZsEJUn/view?usp=sharing

 

 

So, now we move on to understanding the mathematical aspect of the theorem:

 

 Mathematical Approach  

So, let's dive into the mathematics of the Central Limit Theorem.

 Let $\{X_1, X_2, \dots, X_n\}$ be a random sample of size $n$, where the $X_i$ are independent and identically distributed with expected value $\mu$ and variance $\sigma^2$.

 

Suppose we want to calculate the sample mean $\bar{X}_n = (X_1 + X_2 + \dots + X_n) / n$.

We have a theorem called the Law of Large Numbers, which says that as $n$ tends to infinity ($n \to \infty$), the sample mean $\bar{X}_n$ tends to the population mean $\mu$.

To read about the Law of Large Numbers in detail, you can check another post of mine at the following link (highly recommended for better understanding):

https://statisticsexplained.blogspot.com/2020/06/law-of-large-numbers-explained-using.html

 

I am also attaching the plot from the experiment done in the Law of Large Numbers post, to make a connection between the Central Limit Theorem and the Law of Large Numbers.



 A brief recap of what the experiment was in this case:

It was a dice-rolling experiment, and we noted that as $n \to \infty$, the sample mean ($\bar{X}_n$) converges to the population mean, which is 3.5 in this experiment (let $X_i$ be the roll of a die; assuming a fair die, $E[X_i] = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5$).
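For readers who have not seen that post, the dice experiment can be reproduced with a sketch like the one below. It is a fresh illustration in standard-library Python, not the code from the other post; the function name, the seed and the number of rolls are assumptions for the example.

```python
import random

def running_means(num_rolls, seed=2):
    """Roll a fair die `num_rolls` times and return the sample mean after each roll."""
    rng = random.Random(seed)
    total = 0
    means = []
    for i in range(1, num_rolls + 1):
        total += rng.randint(1, 6)  # one roll of a fair six-sided die
        means.append(total / i)
    return means

means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: sample mean = {means[n - 1]:.4f}")
```

The printed sample means should wander for small n and then settle closer and closer to 3.5.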

 Now, the main part: look carefully at the shape of the convergence. The fluctuations of the sample mean around 3.5 shrink as n grows, and the Central Limit Theorem says that, for large n, these fluctuations are approximately normally distributed.

So, I think we did a fine job of intuitively relating the Law of Large Numbers and the Central Limit Theorem.

 

So, what does the Central Limit Theorem contribute? That is the main question we are interested in examining. (Spoiler: it has something to do with the shape of the convergence!!)

The theorem tries to answer how exactly the convergence takes place (intuitively, how fast or slow it happens). More precisely, it tells us that as $n$ gets larger, the distribution of the difference between the sample mean $\bar{X}_n$ and the population mean $\mu$, when multiplied by the constant $\sqrt{n}$, approximates a normal distribution with mean 0 and variance $\sigma^2$ (the variance of each $X_i$). So, putting it all together, it implies:

$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$$

 For a large enough $n$, we can further show that the distribution of the sample mean $\bar{X}_n$ is approximately normal with mean $\mu$ and variance $\sigma^2 / n$.
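The mean and variance quoted here follow from linearity of expectation and from independence. A short derivation, writing $\mu$ and $\sigma^2$ for the common mean and variance of the $X_i$:

```latex
\begin{align*}
E[\bar{X}_n] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}.
\end{align*}
```

Note that these two facts hold for every n. What the Central Limit Theorem adds is the shape: for large n, the distribution with this mean and variance is approximately normal.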

So the crux is that the distribution of the sample mean $\bar{X}_n$ approaches normality regardless of the distribution of the individual $X_i$ (as long as they have a finite variance). For our experiment, we took samples from a uniform distribution, but it turned out that the means of the samples gave an approximation to the normal distribution in the simulation we ran using Python.
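To check the "regardless of the distribution" part on something clearly lopsided, here is a hedged sketch that swaps the uniform distribution for an exponential one (mean 1, standard deviation 1). This swap is my own choice for illustration and is not part of the original experiment. The sketch standardizes each sample mean and counts how many land within 1 and 2 standard deviations of zero; for a normal distribution these fractions are about 0.683 and 0.954.

```python
import random
import statistics

def standardized_means(draw, mu, sigma, sample_size, num_samples, seed=3):
    """Return (sample mean - mu) / (sigma / sqrt(sample_size)) for many samples.

    `draw` is a function taking a random.Random instance and returning one observation.
    """
    rng = random.Random(seed)
    scale = sigma / sample_size ** 0.5
    return [
        (statistics.fmean(draw(rng) for _ in range(sample_size)) - mu) / scale
        for _ in range(num_samples)
    ]

def fraction_within(values, k):
    """Fraction of values whose absolute value is at most k."""
    return sum(abs(v) <= k for v in values) / len(values)

# Exponential distribution with rate 1: heavily skewed, mean 1, standard deviation 1.
z = standardized_means(
    draw=lambda rng: rng.expovariate(1.0),
    mu=1.0,
    sigma=1.0,
    sample_size=50,
    num_samples=5000,
)
within_1 = fraction_within(z, 1)
within_2 = fraction_within(z, 2)
print(f"within 1 sd: {within_1:.3f} (normal: about 0.683)")
print(f"within 2 sd: {within_2:.3f} (normal: about 0.954)")
```

Even though a single exponential draw looks nothing like a bell curve, the two fractions should come out close to the normal values.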

Now, there is a small point that needs a bit of clarity: why did we multiply by the constant $\sqrt{n}$ and not by any other constant?
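A quick numerical experiment can make the question concrete. The sketch below (standard-library Python, with hypothetical names and arbitrarily chosen sizes) multiplies the deviation of the sample mean from 0.5 by n raised to the power 0, 0.5 or 1, and measures the variance of the result for Uniform(0, 1) samples of growing size n.

```python
import random
import statistics

def scaled_deviation_variance(n, exponent, num_samples=1000, seed=4):
    """Variance of n**exponent * (sample mean - mu) for Uniform(0, 1) samples of size n."""
    rng = random.Random(seed)
    mu = 0.5  # mean of Uniform(0, 1)
    deviations = [
        n ** exponent * (statistics.fmean(rng.random() for _ in range(n)) - mu)
        for _ in range(num_samples)
    ]
    return statistics.variance(deviations)

sizes = (10, 100, 1000)
table = {}
for exponent in (0.0, 0.5, 1.0):
    table[exponent] = [scaled_deviation_variance(n, exponent) for n in sizes]
    row = ", ".join(f"n={n}: {v:.5f}" for n, v in zip(sizes, table[exponent]))
    print(f"multiply by n^{exponent}: {row}")
```

With no scaling (power 0) the variance should shrink towards zero, with scaling by n (power 1) it should keep growing, and only with the square root (power 0.5) should it stay roughly constant, near 1/12, the variance of Uniform(0, 1).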

We analyze: as $n \to \infty$, the limit $\bar{X}_n - \mu \to 0$. If instead of $\sqrt{n}$, we multiplied by $n$ or
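One standard way to see why $\sqrt{n}$ is the right constant is to look at the variance after multiplying by a general power $n^{\alpha}$. The exponent $\alpha$ is notation introduced only for this sketch; the rest uses $\operatorname{Var}(\bar{X}_n) = \sigma^2 / n$.

```latex
\begin{align*}
\operatorname{Var}\!\big(n^{\alpha}(\bar{X}_n - \mu)\big)
  &= n^{2\alpha}\,\operatorname{Var}(\bar{X}_n)
   = n^{2\alpha}\cdot\frac{\sigma^2}{n}
   = n^{2\alpha - 1}\,\sigma^2, \\[4pt]
n^{2\alpha - 1}\,\sigma^2 &\longrightarrow
\begin{cases}
0 & \text{if } \alpha < \tfrac{1}{2} \quad (\text{the distribution collapses to a point}), \\
\sigma^2 & \text{if } \alpha = \tfrac{1}{2} \quad (\text{multiplying by } \sqrt{n}), \\
\infty & \text{if } \alpha > \tfrac{1}{2} \quad (\text{e.g. multiplying by } n).
\end{cases}
\end{align*}
```

So $\sqrt{n}$ is the only power of $n$ that keeps the spread from either vanishing or blowing up, which is what lets us talk about a limiting shape at all.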


 

The punchline for any distribution is:

"Even if you're not normal, the mean is normal."

 

Happy Learning 

Sahaj Thareja
