CART (Classification and Regression Trees) is a machine learning algorithm for building decision trees for predictive modeling. It handles both classification problems (predicting categorical outcomes) and regression problems (predicting continuous outcomes).
Key Characteristics of CART:
Binary Tree Structure: CART always splits data into two branches at each decision node, resulting in a binary tree.
Splitting Criteria: For classification, CART picks the split that minimizes the size-weighted Gini impurity of the two child nodes; for regression, it picks the split that minimizes their size-weighted mean squared error (MSE).
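Recursive Partitioning: The algorithm recursively divides the dataset into smaller subsets based on feature values, repeating the split search within each subset until a stopping criterion is met (for example, a maximum depth, a minimum node size, or a node that is already pure).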
Leaf Nodes: Each leaf node holds a final prediction: the majority class of the training samples that reach it (classification) or their mean target value (regression).
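Pruning: CART typically grows a large tree and then cuts it back using cost-complexity (weakest-link) pruning, which removes subtrees whose gain in training accuracy does not justify their added size. This reduces overfitting and improves generalization.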
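The characteristics above can be sketched in a few dozen lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes numeric features only, uses midpoints between distinct values as candidate thresholds, and replaces pruning with a simple max_depth limit. The names gini, mse, best_split, build_tree, and predict are hypothetical helpers chosen for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def mse(values):
    """Mean squared error of the values around their own mean."""
    n = len(values)
    if n == 0:
        return 0.0
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / n


def majority(labels):
    """Leaf prediction for classification: the most common class."""
    return Counter(labels).most_common(1)[0][0]


def mean(values):
    """Leaf prediction for regression: the average target."""
    return sum(values) / len(values)


def best_split(X, y, impurity):
    """Return (score, feature, threshold) with the lowest weighted child
    impurity, or None if no feature has two distinct values."""
    best, n = None, len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # assumed candidate: midpoint of neighbours
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            if best is None or score < best[0]:
                best = (score, j, t)
    return best


def build_tree(X, y, impurity, leaf_value, max_depth=3, depth=0):
    """Recursive binary partitioning; stops at pure nodes or max_depth."""
    if depth >= max_depth or impurity(y) < 1e-12:
        return {"leaf": leaf_value(y)}
    split = best_split(X, y, impurity)
    if split is None:
        return {"leaf": leaf_value(y)}
    _, j, t = split
    left = [i for i in range(len(y)) if X[i][j] <= t]
    right = [i for i in range(len(y)) if X[i][j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           impurity, leaf_value, max_depth, depth + 1),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            impurity, leaf_value, max_depth, depth + 1),
    }


def predict(tree, row):
    """Follow the binary decisions from the root down to a leaf."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Toy data: one feature, four samples.
X = [[2.0], [3.0], [10.0], [11.0]]
clf = build_tree(X, ["a", "a", "b", "b"], gini, majority)
reg = build_tree(X, [1.0, 1.0, 5.0, 5.0], mse, mean)
print(predict(clf, [4.0]))  # -> a
print(predict(reg, [10.5]))  # -> 5.0
```

The same build_tree function serves both tasks because only the impurity measure and the leaf rule differ between classification and regression. A fuller implementation would also handle categorical features and apply pruning after the tree is grown.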
CART-style trees serve as the base learners in ensemble methods such as Random Forest and Gradient Boosting Trees. On its own, a single CART tree is valued for its simplicity and interpretability.