1 year ago

#380426


hua

C4.5: When to stop growing the tree?

I have a dataset with 10 continuous attributes and 500 cases. While growing a decision tree, I ran into a problem: when should I stop growing the tree?

This is a tree that I built in Java with C4.5. At the attribute8 node, the subset is shown in the second picture, sorted by attribute8 value. No threshold on attribute8 splits it so that each side contains only one class. How can I solve this problem?

I have three ideas for solving it, but I don't know which one is correct.

1. If a subset has fewer cases than a threshold, say 2% of the dataset (in this example, fewer than 10 cases), stop growing and label the node with the majority class.

2. If the gain ratio at this node is below a threshold, don't split the node.

3. Split the dataset 250/250, one half for training and one for testing. If splitting a node gives higher test accuracy than not splitting it, keep the split; otherwise, make the node a leaf.
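
To make ideas 1 and 2 concrete, here is a minimal Java sketch of how the two checks could be combined as stopping rules. It is not taken from my actual tree code: the class `StoppingRules`, its method names, and the constants `MIN_CASES` and `MIN_GAIN_RATIO` are all hypothetical, and the threshold values are assumptions (10 cases is 2% of 500; 0.01 is an arbitrary placeholder). Class labels are assumed to be encoded as ints and already sorted by the attribute value being tested.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of any C4.5 library. */
final class StoppingRules {
    static final int MIN_CASES = 10;            // assumed: 2% of 500 cases
    static final double MIN_GAIN_RATIO = 0.01;  // assumed placeholder threshold

    private static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /** Shannon entropy (in bits) of the class labels in labels[from, to). */
    static double entropy(int[] labels, int from, int to) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = from; i < to; i++) {
            counts.merge(labels[i], 1, Integer::sum);
        }
        double n = to - from;
        double h = 0.0;
        for (int c : counts.values()) {
            double p = c / n;
            h -= p * log2(p);
        }
        return h;
    }

    /**
     * Gain ratio of a binary split of labels (sorted by attribute value)
     * into [0, cut) and [cut, n). Returns 0 for a degenerate cut.
     */
    static double gainRatio(int[] labels, int cut) {
        int n = labels.length;
        if (cut <= 0 || cut >= n) {
            return 0.0;
        }
        double pLeft = (double) cut / n;
        double pRight = 1.0 - pLeft;
        double gain = entropy(labels, 0, n)
                - pLeft * entropy(labels, 0, cut)
                - pRight * entropy(labels, cut, n);
        double splitInfo = -(pLeft * log2(pLeft) + pRight * log2(pRight));
        return gain / splitInfo;
    }

    /** Idea 1 + idea 2: stop if the node is small, pure, or the best split is weak. */
    static boolean shouldStop(int[] labels, double bestGainRatio) {
        return labels.length < MIN_CASES
                || entropy(labels, 0, labels.length) == 0.0
                || bestGainRatio < MIN_GAIN_RATIO;
    }

    /** Most frequent label (first to reach the top count wins ties); -1 if empty. */
    static int majorityClass(int[] labels) {
        Map<Integer, Integer> counts = new HashMap<>();
        int best = -1;
        int bestCount = 0;
        for (int label : labels) {
            int c = counts.merge(label, 1, Integer::sum);
            if (c > bestCount) {
                bestCount = c;
                best = label;
            }
        }
        return best;
    }
}
```

With this sketch, a node would become a leaf labeled `majorityClass(labels)` whenever `shouldStop(labels, bestGainRatio)` returns true, where `bestGainRatio` is the largest `gainRatio` found over all attributes and cut points. Idea 3 is not covered here, since it needs the held-out half of the data.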

[Image: the decision tree built so far]

[Image: the subset at the attribute8 node, sorted by attribute8 value]

java

decision-tree

c4.5

0 Answers
