
Purpose and Types of Conventions in Scikit-Learn

Scikit-Learn offers a uniform API built on three interfaces: an estimator interface for building and fitting models, a predictor interface for making predictions, and a transformer interface for converting data. Its conventions exist to keep the API aligned with the following design objectives:

  • Consistency: All objects, basic or composite, share a consistent interface composed of a limited set of methods.
  • Inspection: Constructor parameters and the parameter values learned during fitting are stored and exposed as public attributes.
  • Non-proliferation of classes: Datasets are represented as NumPy arrays or SciPy sparse matrices, and hyper-parameter names and values as standard Python strings or numbers, which keeps framework code to a minimum.
  • Composition: Algorithms that can be expressed as sequences or combinations of transformations, or as meta-algorithms, are implemented and composed from existing building blocks rather than written from scratch.
  • Sensible defaults: Whenever an operation requires a user-defined parameter, Scikit-Learn provides a default value that lets the operation run sensibly, giving a baseline solution for the task.
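
As a rough illustration of these objectives, the sketch below walks through the three interfaces and then inspects and composes the resulting objects. The toy data, the choice of StandardScaler and LogisticRegression, and the value C=10.0 are assumptions made for this example only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data made up for this illustration: 6 samples, 2 features, binary labels
X = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.9], [0.9, 0.1], [0.1, 0.8], [0.8, 0.3]])
y = np.array([0, 1, 0, 1, 0, 1])

# Transformer interface: fit() learns statistics, transform() converts the data
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Estimator and predictor interfaces: fit() learns the model, predict() applies it
model = LogisticRegression(C=10.0).fit(X_scaled, y)
predictions = model.predict(X_scaled)

# Inspection: constructor parameters are public attributes, also listed by get_params();
# parameters that were not passed keep their default values
print(model.C, model.get_params()["C"])

# Inspection: values learned during fit() are exposed with a trailing underscore
print(scaler.mean_)

# Composition: a pipeline built from existing blocks is itself an estimator
pipeline = make_pipeline(StandardScaler(), LogisticRegression(C=10.0))
pipeline_predictions = pipeline.fit(X, y).predict(X)
print(pipeline_predictions.shape)
```

Because the pipeline exposes the same fit() and predict() methods as its final step, it can be used wherever a single estimator is expected.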

The sections below illustrate several of these conventions.

Type Casting

The following example shows fit_transform() casting input data from int64 to float64.

import numpy as np
from sklearn.random_projection import GaussianRandomProjection

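ran = np.random.RandomState(0)
data = ran.rand(100, 5000)
# Cast explicitly to int64 so the printed dtype is the same on every platform;
# values in [0, 1) truncate to 0, which is fine for a dtype demonstration
data = np.array(data, dtype = 'int64')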
print(data.dtype)

data_transformer = GaussianRandomProjection()
new_data = data_transformer.fit_transform(data)
print(new_data.dtype)

Output

int64
float64
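
Type handling also applies to targets. The sketch below, which assumes the iris dataset and a default SVC purely for illustration, fits the same classifier first on integer labels and then on string labels; in each case predict() returns labels of the kind it was fit on.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
classifier = SVC()

# Integer targets in, integer predictions out
classifier.fit(iris.data, iris.target)
int_predictions = classifier.predict(iris.data[:3])
print(int_predictions)

# String targets in, string predictions out
classifier.fit(iris.data, iris.target_names[iris.target])
str_predictions = classifier.predict(iris.data[:3])
print(str_predictions)
```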

Refitting and Updating Parameters

An estimator's hyper-parameters can be updated after it has been constructed by calling set_params(). Calling fit() again then overwrites whatever was learned by the previous fit, as shown in the following example.

from sklearn.datasets import load_iris
from sklearn.svm import SVC

data, target = load_iris(return_X_y = True)
classifier = SVC()

classifier.set_params(kernel = 'linear').fit(data, target)
print(classifier.predict(data[:5]))

classifier.set_params(kernel = 'rbf', gamma = 'scale').fit(data, target)
print(classifier.predict(data[:5]))

Output

[0 0 0 0 0]
[0 0 0 0 0]
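
Since both kernels predict the same classes for the first five samples, the printed output alone does not show that the model changed. The following sketch is one way to check, assuming the same iris data: it reads the updated parameter back with get_params() and compares the number of support vectors before and after refitting. The variable names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

data, target = load_iris(return_X_y=True)
classifier = SVC()

# set_params() returns the estimator itself, which is why calls can be chained
returned = classifier.set_params(kernel="linear")
print(returned is classifier)

classifier.fit(data, target)
linear_support_count = classifier.support_vectors_.shape[0]

# The new value is visible immediately, but the fitted state
# only changes once fit() is called again
classifier.set_params(kernel="rbf")
print(classifier.get_params()["kernel"])

classifier.fit(data, target)
rbf_support_count = classifier.support_vectors_.shape[0]
print(linear_support_count, rbf_support_count)
```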

Multiclass and Multilabel Fitting

How a multiclass classifier learns and predicts depends on the format of the target data it is fit on. In the following example, the classifier is fit on a one-dimensional array of multiclass labels, so predict() returns the corresponding multiclass predictions.

from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

data = [[1, 1], [2, 3], [3, 4], [4, 1], [2, 3]]
target = [1, 1, 0, 2, 1]

classifier = OneVsRestClassifier(estimator = SVC(gamma = 'scale', random_state = 0))
print(classifier.fit(data, target).predict(data))

Output

[1 1 0 2 1]

It is also possible to fit on a two-dimensional array of binary label indicators, in which case predict() returns a two-dimensional indicator array, as in the following example.

from sklearn.svm import SVC
from sklearn.preprocessing import LabelBinarizer
from sklearn.multiclass import OneVsRestClassifier

data = [[1, 1], [2, 3], [3, 4], [4, 1], [2, 3]]
target = [1, 1, 0, 2, 1]

target = LabelBinarizer().fit_transform(target)
classifier = OneVsRestClassifier(estimator = SVC(gamma = 'scale', random_state = 0))
print(classifier.fit(data, target).predict(data))

Output

[[0 1 0]
 [0 1 0]
 [0 0 0]
 [0 0 1]
 [0 1 0]]
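
To read the indicator output above, it helps to look at what LabelBinarizer produces on its own. The sketch below uses the same target list and prints the column order, the encoded matrix, and the labels recovered from it. Comparing the encoded matrix with the prediction shows that the third row of the prediction is all zeros, which suggests that none of the per-class binary classifiers claimed that sample.

```python
from sklearn.preprocessing import LabelBinarizer

target = [1, 1, 0, 2, 1]

binarizer = LabelBinarizer()
indicator = binarizer.fit_transform(target)

# classes_ gives the label that each indicator column stands for
print(binarizer.classes_)
print(indicator)

# inverse_transform() maps indicator rows back to the original labels
recovered = binarizer.inverse_transform(indicator)
print(recovered)
```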

Similarly, in multilabel fitting each instance can be assigned more than one label, as in the following example.

from sklearn.svm import SVC
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier

data = [[1, 1], [2, 3], [3, 4], [4, 1], [2, 3]]
target = [[1, 3], [1, 1], [2, 4], [1, 3, 5], [1, 5]]

target = MultiLabelBinarizer().fit_transform(target)
classifier = OneVsRestClassifier(estimator = SVC(gamma = 'scale', random_state = 0))
print(classifier.fit(data, target).predict(data))

Output

[[1 0 1 0 0]
 [1 0 0 0 0]
 [1 0 0 0 0]
 [1 0 1 0 1]
 [1 0 0 0 0]]
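
Each column of the multilabel output corresponds to one label seen during fitting. As a minimal sketch using the same target lists, the code below prints that column order and converts an indicator matrix back into label sets with inverse_transform(). Note that a repeated label such as [1, 1] collapses to a single label.

```python
from sklearn.preprocessing import MultiLabelBinarizer

target = [[1, 3], [1, 1], [2, 4], [1, 3, 5], [1, 5]]

binarizer = MultiLabelBinarizer()
indicator = binarizer.fit_transform(target)

# classes_ gives the label that each indicator column stands for
print(binarizer.classes_)
print(indicator)

# inverse_transform() turns each indicator row back into a tuple of labels
label_sets = binarizer.inverse_transform(indicator)
print(label_sets)
```

The same inverse_transform() call can be applied to the array returned by predict() to express predictions as label sets instead of indicator rows.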
