Prior Probability

Understanding prior probability is important because it lets us combine new information with past data to make better decisions and improve accuracy. Prior probability forms the foundation of Bayes' Theorem, which allows us to integrate new data with existing knowledge to improve estimation accuracy.

In this article, we will take a deeper look at prior probability, its applications in various fields, and some examples to understand it better.

Table of Contents

  • What is Prior Probability?
    • Significance of Prior Probability in Bayesian Statistics
  • Classification of Priors
    • Informative Priors
    • Weakly Informative Priors
    • Non-informative Priors
    • Improper Priors
  • Applications of Prior Probability
  • Solved Problems on Prior Probability
  • Frequently Asked Questions (FAQs)

What is Prior Probability?

Prior probability is defined as the initial assessment of the likelihood of an event or outcome before any new data is considered. In simple words, it tells us what we know based on previous knowledge or experience.

For example, suppose we have data showing that 30% of the days in a month are rainy; then the prior probability of rain on any random day of that month is 30%. Prior probability plays a significant role in Bayesian statistics, where it is combined with new data to produce updated estimates that eventually improve accuracy.

Prior probability is used in various fields like machine learning and medical diagnosis, where decisions can be made from the available data. It also allows us to change or update our beliefs as new data becomes available.

Significance of Prior Probability in Bayesian Statistics

In Bayesian statistics, prior probability plays an important role because it represents initial beliefs based on available or past data before any new data is considered; this forms the foundation of Bayesian inference. Bayesian methods combine the prior probability with the likelihood of the observed data, which produces the posterior probability reflecting the updated knowledge.


This approach helps improve overall estimation accuracy with limited data and prevents overfitting. In this way, prior probability enables adaptive learning, since beliefs are updated continuously as new data arrives.

The expression for Bayes' Theorem is as follows:

P(A|B) = P(B|A) · P(A) / P(B)

where,

  • P(A|B) is the posterior probability, defined as the probability of event A given that event B has occurred.
  • P(B|A) is the likelihood, defined as the probability of event B given that event A has occurred.
  • P(A) is the prior probability, the initial probability of event A before the evidence B is considered.
  • P(B) is the marginal likelihood, defined as the total probability of event B under all possible events.

If there exist multiple mutually exclusive events Ai, it can be calculated as P(B) = ∑ P(B|Ai) · P(Ai).
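
To make the calculation concrete, here is a minimal Python sketch (the events and numbers are illustrative assumptions, not from the article) that applies Bayes' Theorem to a set of mutually exclusive events:

    # Posterior P(A_i | B) from priors P(A_i) and likelihoods P(B | A_i).

    def posteriors(priors, likelihoods):
        # Marginal likelihood: P(B) = sum_i P(B | A_i) * P(A_i)
        marginal = sum(p * l for p, l in zip(priors, likelihoods))
        # Bayes' Theorem applied to each event A_i
        return [p * l / marginal for p, l in zip(priors, likelihoods)]

    # Two mutually exclusive events with priors 0.3 and 0.7, and
    # likelihoods P(B | A1) = 0.8, P(B | A2) = 0.1.
    print(posteriors([0.3, 0.7], [0.8, 0.1]))  # -> [0.7742..., 0.2258...]

Note how an event with a lower prior can still end up with the higher posterior when its likelihood is large enough.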

Classification of Priors

In Bayesian statistics, priors are classified based on the amount of information they contain. Some of the types of priors are discussed below:

  • Informative Priors
  • Weakly Informative Priors
  • Non-informative Priors
  • Improper Priors

Informative Priors

These priors encode detailed knowledge, chosen from historical data or decided under expert guidance. They have a significant impact on the posterior distribution. Informative priors are most useful when we have strong information that should drive the analysis.

Weakly Informative Priors

These priors sit between informative and non-informative priors. They encode some prior knowledge but do not dominate the posterior distribution. They provide some regularization and prevent fitting to noise, while still allowing the data to influence the posterior distribution. A normal prior with high variance can be considered a weakly informative prior.

Non-informative Priors

Non-informative priors are also known as uninformative priors. They carry very little or practically no prior knowledge about the parameter, so they have minimal influence on the posterior distribution, allowing the data to primarily drive the inference. A uniform prior is an example of a non-informative prior: it assigns equal probability to all possible outcomes, reflecting a lack of prior knowledge.

Improper Priors

These are non-informative priors that do not integrate to one over the parameter space, i.e. they are not valid probability distributions. Such priors can still be used in Bayesian statistics as long as the resulting posterior distribution remains proper, meaning it integrates to one. Improper priors are generally used for parameters with unbounded ranges, for example the prior 1/θ over a parameter θ.
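
To see how these prior types differ in practice, here is a small Python sketch (not from the article; the Beta parameters are illustrative assumptions) using a Beta-Binomial model, where a Beta(α, β) prior updated with h successes and t failures yields a Beta(α + h, β + t) posterior:

    # Contrast prior types with a Beta-Binomial model: 7 heads in 10 flips.
    # The specific Beta parameters below are illustrative choices.

    def beta_posterior_mean(alpha, beta, heads, tails):
        # Conjugate update: Beta(a, b) prior + data -> Beta(a + heads, b + tails)
        return (alpha + heads) / (alpha + beta + heads + tails)

    heads, tails = 7, 3
    priors = {
        "informative, Beta(50, 50)": (50, 50),       # strong belief in fairness
        "weakly informative, Beta(2, 2)": (2, 2),    # mild pull toward 0.5
        "non-informative, Beta(1, 1)": (1, 1),       # uniform prior
    }
    for name, (a, b) in priors.items():
        mean = beta_posterior_mean(a, b, heads, tails)
        print(f"{name}: posterior mean = {mean:.3f}")

Running this prints roughly 0.518, 0.643 and 0.667: the informative prior keeps the estimate near 0.5 despite the data, while the non-informative (uniform) prior lets the observed 7 out of 10 dominate.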

Applications of Prior Probability

Applications of prior probability include:

  • Medical Diagnosis: Prior probabilities are used in medical diagnosis to determine the likelihood of a disease before testing.
  • Spam Filtering: Prior probabilities are used in email filtering to classify an email as spam or not spam based on historical data (see the sketch after this list).
  • Financial Forecasting: Investors use prior probability to assess investment risk before incorporating market trends and data.
  • Machine Learning: Prior probabilities are integrated into various algorithms to improve model performance and accuracy.
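
As an illustration of the spam-filtering application above, the following sketch shows how the class prior enters the calculation; the word likelihoods are made-up numbers for demonstration, not real email statistics:

    # How a prior P(Spam) combines with a word-level likelihood.
    # All probabilities below are assumed values for illustration.

    p_spam = 0.02                # prior: 2% of all email is spam
    p_word_given_spam = 0.30     # P(word appears | spam), assumed
    p_word_given_ham = 0.01      # P(word appears | not spam), assumed

    # Marginal probability of seeing the word in any email
    p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

    # Posterior probability that the email is spam, given the word
    p_spam_given_word = p_word_given_spam * p_spam / p_word
    print(f"P(Spam | word) = {p_spam_given_word:.3f}")  # ~0.380

Even though the word is 30 times more likely in spam than in legitimate email, the low prior keeps the posterior under 40%.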

Solved Problems on Prior Probability

Problem 1: Given that 2% of emails are spam, what is the prior probability of an email being spam?

Solution:

Since it is given that 2% of emails are spam, the prior probability of an email being spam is:

∴ P(Spam) = 2% = 0.02

Problem 2: Consider a medical scenario where historical data shows that 1% of the population has a certain disease. Find the prior probability of a patient having the disease.

Solution:

Since it is given that 1% of the population has the disease, the prior probability of a patient having the disease is:

∴ P(Disease) = 1% = 0.01

Problem 3: Imagine a situation where a bank wants to assess the risk of a borrower defaulting on a loan. Historically, 5% of borrowers default on their loans. The bank uses a credit scoring system that correctly predicts defaults with 85% sensitivity (true positive rate) and correctly predicts non-defaults with 90% specificity (true negative rate). If a borrower receives a positive result on this credit scoring system, indicating potential default, what is the probability that the borrower will actually default on the loan?

Solution:

Given,

  • Prior probability that a borrower defaults is 5%
  • Sensitivity of the credit scoring system is 85%, meaning it correctly identifies 85% of those who will default
  • Specificity is 90%, meaning it correctly identifies 90% of those who will not default

Using these values, we calculate the overall probability of a positive result from the credit scoring system:

∴ P(Positive) = P(Positive | Default) · P(Default) + P(Positive | No Default) · P(No Default)

Now, substituting the values given, we get

∴ P(Positive) = (0.85 × 0.05) + ((1 − 0.9) × (1 − 0.05))

∴ P(Positive) = (0.85 × 0.05) + (0.1 × 0.95)

∴ P(Positive) = 0.0425 + 0.095

∴ P(Positive) = 0.1375

Now, using Bayes' Theorem, we find the posterior probability:

∴ P(Default | Positive) = P(Positive | Default) · P(Default) / P(Positive)

Substituting the values in the above expression

∴ P(Default | Positive) = (0.85 × 0.05) / 0.1375

∴ P(Default | Positive) = 0.0425 / 0.1375

∴ P(Default | Positive) ≈ 0.3091

Therefore, after a positive result from the credit scoring system, the probability that the borrower will actually default on the loan is approximately 30.91%.
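
The same calculation can be verified with a short Python snippet that transcribes the numbers given in the problem:

    # Problem 3 in code: posterior probability of default given a positive score.

    p_default = 0.05      # prior P(Default)
    sensitivity = 0.85    # P(Positive | Default)
    specificity = 0.90    # P(Negative | No Default)

    # Total probability of a positive result
    p_positive = sensitivity * p_default + (1 - specificity) * (1 - p_default)

    # Bayes' Theorem
    p_default_given_positive = sensitivity * p_default / p_positive

    print(f"P(Positive) = {p_positive:.4f}")                          # 0.1375
    print(f"P(Default | Positive) = {p_default_given_positive:.4f}")  # 0.3091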

Frequently Asked Questions (FAQs)

What is the Role of Prior Probability in Decision Theory?

In decision theory, prior probability helps in making decisions through a probabilistic framework that combines prior knowledge with new evidence, leading to a more structured and informed decision-making process.

What are Empirical and Subjective Priors, and How do they Differ from Each Other?

Empirical priors are derived from the data itself, or from a larger dataset similar to the one being analyzed, whereas subjective priors are based on an individual's perspective or personal beliefs.

How do improper priors affect posterior distribution?

Improper priors can only be used when they result in a proper posterior distribution, meaning the posterior integrates to one over the parameter space and can be interpreted as a valid probability distribution. Care must be taken to ensure the posterior is proper, as an improper posterior is not a valid result.

Can prior probabilities be non-constant?

Yes, prior probabilities can be non-constant, meaning they depend on additional parameters. For example, in spatial statistics, prior probabilities vary across regions according to their geographical information.


