
Purpose and Types of Boosting Methods in Scikit-Learn


Boosting methods build an ensemble model incrementally by training base estimators in sequence, so that each new learner corrects the mistakes of those before it. Combining many weak learners over multiple iterations yields a powerful ensemble. Scikit-Learn provides two main boosting methods: AdaBoost and Gradient Tree Boosting.

1. AdaBoost

AdaBoost is a successful boosting ensemble method that adjusts instance weights after each iteration, increasing the weights of misclassified instances so that subsequent models focus more on the hard cases. It can be used for both classification and regression.
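The reweighting loop can be sketched in a few lines. This is an illustrative simplification of discrete AdaBoost (binary labels in {-1, +1}), not Scikit-Learn's actual implementation; the function names here are made up for the sketch.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def adaboost_sketch(X, y, n_rounds=10):
    """Fit decision stumps sequentially, reweighting instances each round."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start with uniform weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w[pred != y]) / np.sum(w)   # weighted error rate
        err = np.clip(err, 1e-10, 1 - 1e-10)     # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)    # this learner's voting weight
        w *= np.exp(-alpha * y * pred)           # boost weights of mistakes
        w /= w.sum()                             # renormalize
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas


def adaboost_predict(stumps, alphas, X):
    """Weighted majority vote of all stumps."""
    scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(scores)
```

Because misclassified instances gain weight, each new stump is pushed toward the examples the ensemble still gets wrong.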

1.1. Classification With AdaBoost

Scikit-Learn builds an AdaBoost classifier with the AdaBoostClassifier class, whose base_estimator parameter (renamed estimator in recent releases) selects the weak learner. If left as None, it defaults to DecisionTreeClassifier(max_depth=1). The following example shows how to build an AdaBoost classifier, make a prediction, and check its score.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

data, target = make_classification(n_samples=2000, n_features=10,
                                   n_informative=2, n_redundant=0,
                                   random_state=0, shuffle=False)

classifier = AdaBoostClassifier(n_estimators=200, random_state=0)
classifier.fit(data, target)
print(classifier.score(data, target))
print(classifier.predict([[1, 1, 1, 0, 2, 3, 0, 1, 2, 2]]))

Output

0.9905
[1]

The example below builds the same classifier on the Pima Indians Diabetes dataset, evaluated with 10-fold cross-validation.

from pandas import read_csv
from sklearn.model_selection import KFold, cross_val_score
from sklearn.ensemble import AdaBoostClassifier

path = "pima-indians-diabetes.csv"
headers = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
# The file has its own header row, so skip it and apply our column names
dataset = read_csv(path, names=headers, skiprows=1)
data = dataset.values[:, 0:8]
target = dataset.values[:, 8]

kfold = KFold(n_splits=10)
classifier = AdaBoostClassifier(n_estimators=150)
results = cross_val_score(classifier, data, target, cv=kfold)
print(results.mean())

Output

0.7643369788106631

The “pima-indians-diabetes.csv” dataset can be downloaded using the following link:

https://github.com/npradaschnor/Pima-Indians-Diabetes-Dataset/blob/master/diabetes.csv

1.2. Regression With AdaBoost

Scikit-Learn’s AdaBoostRegressor accepts parameters similar to those of the AdaBoost classifier. The following example shows how to construct the regressor and predict new values with the predict() method.

from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

data, target = make_regression(n_features=10, n_informative=2,
                               random_state=0, shuffle=False)

regressor = AdaBoostRegressor(random_state=0, n_estimators=100)
regressor.fit(data, target)
print(regressor.predict([[0, 2, 1, 0, 1, 0, 1, 0, 2, 2]]))

Output

[75.8528769]

2. Gradient Tree Boosting

This method, also called Gradient Boosted Regression Trees (GBRT), generalizes boosting to arbitrary differentiable loss functions, creating an ensemble of weak prediction models. It effectively addresses regression and classification problems while handling mixed-type data efficiently.
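The core idea can be sketched for regression with squared-error loss: each new tree is fit to the residuals (the negative gradient) of the ensemble built so far. This is an illustrative simplification with made-up function names, not Scikit-Learn's implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def gbrt_sketch(X, y, n_trees=50, learning_rate=0.1):
    """Fit trees sequentially, each one predicting the current residuals."""
    baseline = y.mean()                          # constant best first guess
    pred = np.full(len(y), baseline)
    trees = []
    for _ in range(n_trees):
        residuals = y - pred                     # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=2)
        tree.fit(X, residuals)
        pred += learning_rate * tree.predict(X)  # shrunken additive update
        trees.append(tree)
    return baseline, trees


def gbrt_predict(baseline, trees, X, learning_rate=0.1):
    """Sum the baseline and all shrunken tree contributions."""
    pred = np.full(X.shape[0], baseline)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```

Swapping the residual computation for the gradient of another differentiable loss is exactly the generalization GBRT provides.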

2.1. Classification With Gradient Tree Boost

The Scikit-Learn library offers GradientBoostingClassifier for building Gradient Tree Boost classifiers. The key parameter is loss; for probabilistic classification it is set to 'log_loss' (named 'deviance' in older releases). The n_estimators parameter determines the number of weak learners, while the learning_rate parameter, typically within (0.0, 1.0], mitigates overfitting through shrinkage. The following examples show how to construct this classifier, using a synthetic dataset (fitting the classifier with 100 learners) and the Pima Indians dataset (fitting the classifier with 150 learners).

Example 1 with the random dataset:

from sklearn.datasets import make_hastie_10_2
from sklearn.ensemble import GradientBoostingClassifier

data, target = make_hastie_10_2(random_state=0)
data_train, data_test = data[:5000], data[5000:]
target_train, target_test = target[:5000], target[5000:]

classifier = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0,
                                        max_depth=1, random_state=0)
classifier.fit(data_train, target_train)
print(classifier.score(data_test, target_test))

Output

0.9171428571428571

Example 2 with the Pima-Indian dataset:

from pandas import read_csv
from sklearn.model_selection import KFold, cross_val_score
from sklearn.ensemble import GradientBoostingClassifier

path = "pima-indians-diabetes.csv"
headers = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
# The file has its own header row, so skip it and apply our column names
dataset = read_csv(path, names=headers, skiprows=1)
data = dataset.values[:, 0:8]
target = dataset.values[:, 8]

kfold = KFold(n_splits=10)
classifier = GradientBoostingClassifier(n_estimators=150, max_features=5)
results = cross_val_score(classifier, data, target, cv=kfold)
print(results.mean())

Output

0.761637047163363
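Because a smaller learning_rate usually requires more trees, it helps to score the ensemble after every boosting stage and pick a good stopping point. The staged_predict() method makes this cheap; the dataset and settings below are illustrative.

```python
from sklearn.datasets import make_hastie_10_2
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

data, target = make_hastie_10_2(random_state=0)
data_train, data_test = data[:5000], data[5000:]
target_train, target_test = target[:5000], target[5000:]

classifier = GradientBoostingClassifier(n_estimators=200, learning_rate=0.5,
                                        max_depth=1, random_state=0)
classifier.fit(data_train, target_train)

# Test accuracy after each of the 200 boosting stages
staged_scores = [accuracy_score(target_test, pred)
                 for pred in classifier.staged_predict(data_test)]
best_stage = max(range(len(staged_scores)), key=staged_scores.__getitem__)
print(best_stage + 1, staged_scores[best_stage])
```

If the best score occurs well before the last stage, the extra trees are only adding overfitting risk and can be dropped.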

2.2. Regression With Gradient Tree Boost

GradientBoostingRegressor supports gradient tree boosting with a choice of loss functions, defaulting to squared error (least squares) for regression. The example below illustrates how to build this regressor and compute the mean squared error on a held-out test set.

from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

data, target = make_friedman1(n_samples=5000, random_state=0, noise=1.0)
data_train, data_test = data[:4000], data[4000:]
target_train, target_test = target[:4000], target[4000:]

regressor = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                      max_depth=1, random_state=0,
                                      loss='squared_error')
regressor.fit(data_train, target_train)
print(mean_squared_error(target_test, regressor.predict(data_test)))

Output

5.036557893137746
