How to Calculate Information Gain in a Decision Tree?
Answer: To calculate information gain in a decision tree, subtract the weighted average entropy of the child nodes from the entropy of the parent node.
In more detail, follow these steps (a worked Python sketch follows the list):
- Calculate the Entropy of the Parent Node:
- Compute the entropy of the parent node using the formula: $\text{Entropy} = -\sum_{i=1}^{c} p_i \log_2(p_i)$
- Where $p_i$ is the proportion of instances belonging to class $i$, and $c$ is the number of classes.
- Split the Data:
- Split the dataset into subsets based on the values of a selected attribute (feature).
- Calculate the Entropy of Child Nodes:
- For each subset (child node), calculate its entropy using the same formula as step 1.
- Calculate the Weighted Average Entropy of Child Nodes:
- Calculate the weighted average entropy of the child nodes using the formula: $\text{Weighted Average Entropy} = \sum_{j=1}^{m} \frac{N_j}{N} \cdot \text{Entropy}(\text{Child}_j)$
- Where $N_j$ is the number of instances in the $j$th child node, $N$ is the total number of instances, and $m$ is the number of child nodes.
- Calculate Information Gain:
- Information Gain is the difference between the entropy of the parent node and the weighted average entropy of the child nodes: $\text{Information Gain} = \text{Entropy}(\text{Parent}) - \text{Weighted Average Entropy}(\text{Children})$
- Select the Attribute with the Highest Information Gain:
- Choose the attribute (feature) that yields the highest information gain as the splitting criterion for the current node in the decision tree.
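Here is a minimal end-to-end sketch of these steps in Python, using only the standard library. The `Outlook`/`Play` toy dataset and the function names `entropy` and `information_gain` are invented for illustration; they are not from any particular library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels (step 1)."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(feature_values, labels):
    """Information gain of splitting `labels` on `feature_values` (steps 2-5)."""
    total = len(labels)
    parent_entropy = entropy(labels)
    # Step 2: split the data into subsets, one per feature value.
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    # Steps 3-4: entropy of each child, weighted by its share of the instances.
    weighted_child_entropy = sum(
        (len(subset) / total) * entropy(subset) for subset in subsets.values()
    )
    # Step 5: parent entropy minus weighted average child entropy.
    return parent_entropy - weighted_child_entropy

# Toy example: an "Outlook" feature vs. a binary "Play" label.
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Overcast"]
play    = ["No",    "No",    "Yes",      "Yes",  "No",   "Yes"]
print(round(information_gain(outlook, play), 3))  # 0.667
```

On this toy split the parent entropy is 1.0 (three "Yes", three "No") and the weighted child entropy is 1/3, so the printed gain is about 0.667. In a full tree builder you would call `information_gain` for every candidate attribute and split on the one with the highest value (step 6).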