A decision tree is a supervised machine learning algorithm that is popular because the logic behind its predictions is easy to follow. Most boosting algorithms are built on decision trees. A decision tree is a tree-like model that starts at a root node and ends at the leaves, which hold the output values. At each internal node, the model decides how to split the dataset, and these successive splits form the tree-like structure. The accuracy of the model depends on many factors, and one of them is the depth of the tree. In this article, we will explore the depth of the decision tree and see how it affects the accuracy of the model.
Formation of Decision Tree
Before looking at the depth of a decision tree, we need to understand decision trees themselves and how they are formed. A decision tree is a machine learning algorithm that is used for both classification and regression problems. It is a supervised algorithm, which means it uses a labeled training dataset to learn the relation between the input and output variables and then makes predictions for new input values.
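As a minimal sketch of this train-then-predict workflow, the example below fits a classification tree with scikit-learn. The Iris dataset, the 75/25 split, and the random_state values are assumptions chosen only for illustration; for a regression problem, DecisionTreeRegressor would be used in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: inputs X and known outputs y (assumption: Iris dataset).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Learn the relation between inputs and outputs from the training set.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Predict outputs for inputs the tree has not seen during training.
predictions = clf.predict(X_test)
test_accuracy = clf.score(X_test, y_test)

print("first predictions:", predictions[:5])
print("test accuracy:", test_accuracy)
```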
A simple decision tree looks like this:
If you are interested in learning how to visualize decision trees, then go to the article about visualization of decision trees.
What is the Depth of the Decision Tree?
The depth of a decision tree is the number of levels of splits between the root node and the deepest leaf. In other words, a decision tree with depth 1 has a single decision node (the root) followed directly by leaves. A decision tree with depth 2 has one extra layer of decision nodes between the root and the leaves, making a total of two. More formally, the depth is the length of the longest path from the root to a leaf, counted in edges.
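The longest-path definition can be sketched in a few lines of plain Python. The Node class, the depth function, and the split labels below are hypothetical names made up for this illustration and are not part of any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node: a split rule or a leaf label."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def depth(node: Optional[Node]) -> int:
    """Number of edges on the longest path from this node down to a leaf."""
    if node is None or (node.left is None and node.right is None):
        return 0  # a leaf (or a missing child) adds no further edges
    return 1 + max(depth(node.left), depth(node.right))


# Depth 1: a root split followed directly by two leaves.
stump = Node("x <= 2.5", Node("class A"), Node("class B"))

# Depth 2: one extra decision node between the root and some of the leaves.
deeper = Node(
    "x <= 2.5",
    Node("class A"),
    Node("y <= 1.0", Node("class B"), Node("class C")),
)

print(depth(stump))   # 1
print(depth(deeper))  # 2
```

Note that the second tree has depth 2 even though one of its leaves sits directly under the root, because only the longest path counts.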
In the scikit-learn (sklearn) library, the max_depth parameter sets the maximum depth of the decision tree. If it is left at its default value of None, nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples.
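To see how the depth limit interacts with accuracy, the sketch below trains the same classifier with several max_depth values and compares training and test accuracy. The breast cancer dataset, the 70/30 split, and the particular depth values are assumptions for illustration only; the exact numbers will differ on other data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative data (assumption: scikit-learn's bundled breast cancer dataset).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Maps the requested max_depth to (actual depth, train accuracy, test accuracy).
results = {}
for max_depth in (1, 2, 3, 5, None):  # None lets the tree grow without a depth limit
    clf = DecisionTreeClassifier(max_depth=max_depth, random_state=42)
    clf.fit(X_train, y_train)
    results[max_depth] = (
        clf.get_depth(),               # depth the fitted tree actually reached
        clf.score(X_train, y_train),   # accuracy on the training data
        clf.score(X_test, y_test),     # accuracy on held-out data
    )

for max_depth, (actual, train_acc, test_acc) in results.items():
    print(f"max_depth={max_depth}: depth={actual}, "
          f"train={train_acc:.3f}, test={test_acc:.3f}")
```

A typical pattern is that training accuracy keeps rising as the tree gets deeper, while test accuracy levels off or drops once the tree starts to overfit. This is why max_depth is usually tuned on held-out data and not left unlimited.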
What are the Other Parameters of the Decision Tree?
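A decision tree has various parameters that control how the tree is grown and when splitting stops. Here is a list of the main ones, with the corresponding scikit-learn names in parentheses. Note that the first two are attributes of a fitted tree and not settings you pass in.
- Tree instance (tree_, available after fitting)
- The number of outputs (n_outputs_, available after fitting)
- Max features (max_features)
- Class weight (class_weight)
- Min impurity decrease (min_impurity_decrease)
- Max leaf nodes (max_leaf_nodes)
- Random state (random_state)
- Min weight fraction in leaf (min_weight_fraction_leaf)
- Min samples in leaf (min_samples_leaf)
- Min samples to split (min_samples_split)
- Max depth (max_depth)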
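Assuming scikit-learn's DecisionTreeClassifier, most of the items listed above map to constructor arguments, while the tree instance and the number of outputs are read from the model after fitting. The specific values below are arbitrary and chosen only to show the syntax.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Constructor arguments: the values are arbitrary, for illustration only.
clf = DecisionTreeClassifier(
    max_depth=3,                    # upper bound on the depth of the tree
    min_samples_split=4,            # samples a node needs before it may be split
    min_samples_leaf=2,             # samples every leaf must keep
    min_weight_fraction_leaf=0.0,   # same idea, as a fraction of total sample weight
    max_features=None,              # consider all features at each split
    max_leaf_nodes=8,               # upper bound on the number of leaves
    min_impurity_decrease=0.0,      # minimum impurity reduction required to split
    class_weight="balanced",        # reweight classes by inverse frequency
    random_state=0,                 # makes tie-breaking reproducible
)
clf.fit(X, y)

# Fitted attributes: only available after calling fit().
print("number of outputs:", clf.n_outputs_)
print("nodes in tree:", clf.tree_.node_count)
print("depth reached:", clf.tree_.max_depth)
print("leaves:", clf.get_n_leaves())
```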
Decision trees are considered one of the most important and fundamental algorithms in machine learning, as they are the foundation of boosting algorithms. They underlie many important models, such as random forests, extra-trees regressors, and boosting algorithms. In this article, we looked at one parameter of the decision tree, the depth of the tree, and how it is defined and controlled.