1 year ago
#380426
hua
C4.5: When to stop growing the tree?
I have a dataset with 10 continuous attributes and 500 cases. While growing a decision tree, I ran into a problem: when should I stop growing the tree?
This is a tree that I built in Java with C4.5. At the attribute8 node, the subdataset (shown in the second picture, sorted by attribute8 value) cannot be split so that each side contains only one class. How do I solve this problem?
I have three ideas for solving it, but I don't know which one is correct.
1. If the subdataset has fewer cases than a threshold, maybe 2% of the dataset (in this example, fewer than 10 cases), stop growing and label the node with the majority class.
2. If the gain ratio of the node is less than a threshold, don't split the node.
3. Split the dataset 250/250, one half for training and one for testing. If splitting a node gives higher accuracy than not splitting, keep the split; otherwise, don't split.
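To make ideas 1 and 2 concrete, here is a minimal sketch of how the two checks might look for a binary split on a continuous attribute. Everything in it is an assumption for illustration: the class name `StoppingRules`, the method names, the representation of each side of the split as an array of per-class case counts, and the thresholds `minCases` and `minGainRatio`. It is not taken from any C4.5 implementation, and it does not cover idea 3 (the 250/250 hold-out check).

```java
final class StoppingRules {

    // Total number of cases in a per-class count array.
    static int sum(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    // Shannon entropy, in bits, of a per-class count array.
    static double entropy(int[] counts) {
        int total = sum(counts);
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Gain ratio of a binary split: information gain divided by split information.
    // left[i] and right[i] are the numbers of class-i cases on each side.
    static double gainRatio(int[] left, int[] right) {
        int nLeft = sum(left);
        int nRight = sum(right);
        if (nLeft == 0 || nRight == 0) {
            return 0.0; // a split that leaves one side empty separates nothing
        }
        int n = nLeft + nRight;
        int[] parent = new int[left.length];
        for (int i = 0; i < left.length; i++) {
            parent[i] = left[i] + right[i];
        }
        double gain = entropy(parent)
                - ((double) nLeft / n) * entropy(left)
                - ((double) nRight / n) * entropy(right);
        double splitInfo = entropy(new int[] {nLeft, nRight});
        return gain / splitInfo;
    }

    // Idea 1: stop when the node holds fewer than minCases cases.
    // Idea 2: stop when the best candidate split has gain ratio below minGainRatio.
    // Both thresholds are hypothetical tuning parameters.
    static boolean shouldStop(int[] left, int[] right, int minCases, double minGainRatio) {
        int n = sum(left) + sum(right);
        if (n < minCases) {
            return true;
        }
        return gainRatio(left, right) < minGainRatio;
    }

    // Index of the most frequent class, used to label a leaf.
    static int majorityClass(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A candidate split where both sides keep the same class mix.
        int[] left = {3, 3};
        int[] right = {2, 2};
        System.out.println("gain ratio = " + gainRatio(left, right));
        System.out.println("stop = " + shouldStop(left, right, 10, 0.01));
        System.out.println("leaf class = " + majorityClass(new int[] {3, 7}));
    }
}
```

With this shape, a node like the attribute8 one, where the best available split still leaves both classes mixed on each side, would fall under idea 2 if its gain ratio is below the threshold: the node becomes a leaf labeled by `majorityClass` instead of being split further.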
java
decision-tree
c4.5
0 Answers