How to Calculate Conditional Probability in R?
In this article, we will discuss how to calculate conditional probability in the R programming language.
Conditional probability is the probability that one event occurs given that another event has occurred. If A and B are two events, the probability of Event B conditioned on the occurrence of Event A is written P(B|A); that is, the conditional probability of Event B given that Event A has already occurred.
Similarly, the probability of Event A conditioned on the occurrence of Event B is written P(A|B), the conditional probability of Event A given that Event B has already occurred.
The formula for conditional probability is
P(A|B) = P(A ∩ B) / P(B)
This is valid only when P(B) ≠ 0, i.e. when event B is not an impossible event.
Similarly,
P(B|A) = P(A ∩ B) / P(A)
This is valid only when P(A) ≠ 0, i.e. when event A is not an impossible event.
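As a minimal sketch of the definition, assuming the joint and marginal probabilities are already known, the conditional probability is the joint probability divided by the probability of the conditioning event. The helper name `cond_prob`, its argument names, and the numeric values below are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: conditional probability as joint / conditioning.
# p_joint is P(A and B); p_given is the probability of the conditioning event.
cond_prob <- function(p_joint, p_given) {
  # The ratio is undefined when the conditioning event is impossible
  if (p_given == 0) {
    stop("the conditioning event must have non-zero probability")
  }
  p_joint / p_given
}

# Illustrative values (assumed, not taken from the examples in this article)
p_a_and_b <- 0.12
p_a <- 0.30
p_b <- 0.40

p_a_given_b <- cond_prob(p_a_and_b, p_b)  # P(A|B): condition on B
p_b_given_a <- cond_prob(p_a_and_b, p_a)  # P(B|A): condition on A
p_a_given_b
p_b_given_a
```

Note that the two results differ because the denominator is always the probability of the event being conditioned on.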
The figure below depicts the Venn diagram representation.
Example 1: Computation of Conditional Probability
From a pack of 50 Pokémon cards, a card is drawn at random. The pack consists of 5 equal sets of 10 cards each (red, blue, green, yellow, and black), and each set contains 2 water-type Pokémon: one of high strength and one of medium strength.
Let A be the event of drawing a high-strength water-type Pokémon card and B be the event of drawing a red card. What is the probability of drawing a high-strength water-type Pokémon card given that a red card has been drawn?
Solution Steps
Step 1: Probability of drawing a red card (Event B).
P(B) = 10/50 (since there are 10 red cards in a pack of 50 Pokémon cards).
Step 2: Probability of drawing a high-strength water-type Pokémon card (Event A).
P(A) = 5/50 (as there are 5 high-strength water-type Pokémon cards in a pack of 50 cards).
Step 3: Probability of drawing a red high-strength water-type Pokémon card.
P(A ∩ B) = 1/50 (as there is one red high-strength water-type Pokémon card in a pack of 50 cards).
Step 4: Since event B has already occurred, there are 10 exhaustive cases rather than 50. Among these 10 red Pokémon cards, there is 1 high-strength water-type Pokémon card.
Hence, P(A|B) = P(A ∩ B) / P(B) = (1/50) / (10/50) = 1/10.
This is the conditional probability of A given that B has already occurred.
Similarly,
P(B|A) = P(A ∩ B) / P(A) = (1/50) / (5/50) = 1/5
This is because exactly 1 of the 5 high-strength water-type Pokémon cards in the pack is red.
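The card example can also be checked by enumerating the pack in base R. This is a sketch under stated assumptions: the column names `colour` and `kind` and the labels `"water_high"`, `"water_medium"`, and `"other"` are made up for illustration, and the 8 non-water cards in each colour are lumped together as `"other"` because the example does not describe them.

```r
# Build the 50-card pack: 5 colours x 10 cards per colour.
# Assumption: each colour has one high-strength water card, one
# medium-strength water card, and 8 cards of unspecified type.
colours <- c("red", "blue", "green", "yellow", "black")
kinds <- c("water_high", "water_medium", rep("other", 8))
deck <- data.frame(
  colour = rep(colours, each = 10),
  kind = rep(kinds, times = 5),
  stringsAsFactors = FALSE
)

# Logical indicators for the two events
A <- deck$kind == "water_high"  # high-strength water-type card
B <- deck$colour == "red"       # red card

# Each card is equally likely, so a probability is a proportion of rows
p_a <- mean(A)       # 5/50
p_b <- mean(B)       # 10/50
p_ab <- mean(A & B)  # 1/50

p_a_given_b <- p_ab / p_b  # P(A|B), 1/10
p_b_given_a <- p_ab / p_a  # P(B|A), 1/5

# Equivalent view: restrict the sample space to the red cards only
p_a_given_b_direct <- mean(A[B])
```

Restricting the sample space with `A[B]` mirrors Step 4 of the hand calculation, where only the 10 red cards remain as possible outcomes.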
Example 2: Computation of Conditional Probability
A store owner has a list of 15 customers. He observes certain patterns in their purchases which are depicted in the table below.
| Customers | Money spent | Frequency |
|---|---|---|
| 1 | High | Less |
| 2 | Low | More |
| 3 | High | More |
| 4 | High | Less |
| 5 | Low | Less |
| 6 | Low | More |
| 7 | High | More |
| 8 | Low | Less |
| 9 | Low | Less |
| 10 | High | More |
| 11 | Low | More |
| 12 | Low | Less |
| 13 | High | Less |
| 14 | High | More |
| 15 | High | Less |
Based on the above table, he is interested in finding out:
- What is the probability of a customer spending high given that they purchase less often?
- What is the probability of a customer spending low given that they purchase more often?
- What is the probability of a customer spending low given that they purchase less often?
- What is the probability of a customer spending high given that they purchase more often?
Solution Steps
1. P(High Spend | Less Frequency)
P(Less Frequency) = 8/15 (as from the table, frequency is less for 8 of the 15 customers)
P(High Spend ∩ Less Frequency) = 4/15 (as from the table, 4 of the 15 customers have high spend and less frequency)
P(High Spend | Less Frequency) = P(High Spend ∩ Less Frequency) / P(Less Frequency) = (4/15) / (8/15) = 0.5
2. P(Low Spend | More Frequency)
P(More Frequency) = 7/15 (as from the table, frequency is more for 7 of the 15 customers)
P(Low Spend ∩ More Frequency) = 3/15 (as from the table, 3 of the 15 customers have low spend and more frequency)
P(Low Spend | More Frequency) = P(Low Spend ∩ More Frequency) / P(More Frequency) = (3/15) / (7/15) = 0.4285714
Similarly,
3. P(Low Spend | Less Frequency) = (4/15) / (8/15) = 0.5
4. P(High Spend | More Frequency) = (4/15) / (7/15) = 0.5714286
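The four results above can be cross-checked with base R alone, without any extra packages. This is a sketch rather than the article's own method: it builds a two-way count table and uses `prop.table()` with `margin = 2` so that each column (frequency level) sums to 1. The variable names `counts` and `cond` are chosen here for illustration.

```r
# Customer data, in the same order as the table (customers 1 to 15)
Money_Spent <- c("High", "Low", "High", "High", "Low", "Low", "High", "Low",
                 "Low", "High", "Low", "Low", "High", "High", "High")
Frequency <- c("Less", "More", "More", "Less", "Less", "More", "More", "Less",
               "Less", "More", "More", "Less", "Less", "More", "Less")

# Two-way table of counts: rows are spend levels, columns are frequency levels
counts <- table(Money_Spent, Frequency)

# Divide each column by its column total, so that
# cond[s, f] = P(Money_Spent = s | Frequency = f)
cond <- prop.table(counts, margin = 2)

cond["High", "Less"]  # 1. P(High Spend | Less Frequency)
cond["Low", "More"]   # 2. P(Low Spend | More Frequency)
cond["Low", "Less"]   # 3. P(Low Spend | Less Frequency)
cond["High", "More"]  # 4. P(High Spend | More Frequency)
```

Conditioning on the column variable is what `margin = 2` expresses; using `margin = 1` instead would give P(Frequency | Money_Spent).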
To reproduce these results in R with the “prob” package, first install the “prob” and “tidyverse” packages and create a data frame. Count the frequency of each unique combination in the data frame, shown as “n” in the output, and represent the data frame as a two-way table showing each combination. Then compute the probability of each row, shown as “probs” in the output. Finally, compute the conditional probabilities required by the problem.
Below is the R code used for the computation.
```r
# Libraries for calculation of conditional probability
library(prob)
library(tidyverse)

Money_Spent <- c("High", "Low", "High", "High", "Low", "Low", "High", "Low",
                 "Low", "High", "Low", "Low", "High", "High", "High")
Frequency <- c("Less", "More", "More", "Less", "Less", "More", "More", "Less",
               "Less", "More", "More", "Less", "Less", "More", "Less")
Customer <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)

# Customer data frame
Customer_Data <- as.data.frame(cbind(Customer, Money_Spent, Frequency))

# Count each unique combination of Money_Spent and Frequency
Customer_Data %>% count(Money_Spent, Frequency, sort = TRUE)

# Creating two-way table from data frame
Customer_Data_Table <- addmargins(table("Money_Spent" = Customer_Data$Money_Spent,
                                        "Frequency" = Customer_Data$Frequency))

# View table
Customer_Data_Table

# Turn the data frame into a probability space (adds the "probs" column)
Customer_Data <- probspace(Customer_Data)
Customer_Data

# Probability of the customer spending high
# given that they are purchasing less often
Prob(Customer_Data, event = Money_Spent == "High", given = Frequency == "Less")

# Probability of the customer spending low
# given that they are purchasing more often
Prob(Customer_Data, event = Money_Spent == "Low", given = Frequency == "More")

# Probability of the customer spending low
# given that they are purchasing less often
Prob(Customer_Data, event = Money_Spent == "Low", given = Frequency == "Less")

# Probability of the customer spending high
# given that they are purchasing more often
Prob(Customer_Data, event = Money_Spent == "High", given = Frequency == "More")
```
Output: