Scikit-Learn provides several functions for evaluating the results of clustering algorithms. The most commonly used ones are described below.
1. Adjusted Rand Index
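The Adjusted Rand Index (ARI) measures the similarity between two clusterings (label assignments) of the same samples. It counts the pairs of samples that are placed in the same cluster or in different clusters by both assignments, then corrects that count for chance agreement. A score of 1 means the two assignments are identical up to a renaming of the labels, and a score near 0 means the agreement is no better than random. The following example shows how to use this metric.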
from sklearn.metrics.cluster import adjusted_rand_score
labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
print(adjusted_rand_score(labels_defined, labels_prediction))
Output
0.4444444444444444
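To make the pair counting concrete, the sketch below recomputes the score for the same labels using only the standard library. The helper name ari_from_pairs is hypothetical, not part of Scikit-Learn, and the code assumes at least two samples.

```python
from collections import Counter
from math import comb


def ari_from_pairs(labels_true, labels_pred):
    """Adjusted Rand index from pair counts (illustrative helper, not a library API)."""
    n = len(labels_true)  # assumes n >= 2
    # Pairs grouped together by both labelings, by each labeling alone.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    # Agreement expected by chance, and the largest agreement possible.
    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both labelings
        return 1.0
    return (together_both - expected) / (maximum - expected)


labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
ari = ari_from_pairs(labels_defined, labels_prediction)
print(round(ari, 4))  # 0.4444
```

Here 3 pairs are grouped together by both labelings, 1.4 would be expected by chance, and the maximum is 5, which gives (3 - 1.4) / (5 - 1.4) = 0.4444.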
2. Mutual Information
Mutual Information measures the agreement between two label assignments of the same samples, ignoring permutations of the label values. Scikit-Learn provides several versions of it, described below.
2.1. Normalized Mutual Information (NMI)
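NMI rescales the mutual information by the entropies of the two assignments so that the score lies between 0 and 1. The example below shows how to use it for performance evaluation.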
from sklearn.metrics.cluster import normalized_mutual_info_score
labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
print(normalized_mutual_info_score(labels_defined, labels_prediction))
Output
0.7336804366512113
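As a rough check of what the score contains, the sketch below recomputes it from entropies using only the standard library. It assumes the mutual information is divided by the arithmetic mean of the two entropies, which appears to be the default normalization in recent Scikit-Learn versions. The helper names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same samples."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in Counter(zip(a, b)).items()
    )


def nmi_arithmetic(a, b):
    """MI divided by the arithmetic mean of the entropies (assumed normalization)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:  # both assignments are a single cluster
        return 1.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
nmi = nmi_arithmetic(labels_defined, labels_prediction)
print(round(nmi, 4))  # 0.7337
```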
2.2. Adjusted Mutual Information (AMI)
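AMI additionally subtracts the mutual information expected by chance, so that random assignments score close to 0 regardless of the number of clusters. The following example illustrates how to use it for performance evaluation.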
from sklearn.metrics.cluster import adjusted_mutual_info_score
labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
print(adjusted_mutual_info_score(labels_defined, labels_prediction))
Output
0.6153846153846159
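The chance correction can be sketched with the standard library alone. The code below assumes the usual hypergeometric model for the expected mutual information (random labelings with fixed cluster sizes) and the arithmetic-mean normalization. The helper names are hypothetical, and degenerate inputs such as a single cluster in both assignments are not handled.

```python
from collections import Counter
from math import comb, log


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in Counter(zip(a, b)).items()
    )


def expected_mutual_information(a, b):
    """MI expected for random labelings that keep the same cluster sizes."""
    n = len(a)
    emi = 0.0
    for size_a in Counter(a).values():
        for size_b in Counter(b).values():
            low = max(1, size_a + size_b - n)
            for overlap in range(low, min(size_a, size_b) + 1):
                # Hypergeometric probability that the two clusters share `overlap` samples.
                prob = comb(size_b, overlap) * comb(n - size_b, size_a - overlap) / comb(n, size_a)
                emi += prob * (overlap / n) * log(n * overlap / (size_a * size_b))
    return emi


def ami_arithmetic(a, b):
    mi = mutual_information(a, b)
    emi = expected_mutual_information(a, b)
    mean_entropy = (entropy(a) + entropy(b)) / 2
    return (mi - emi) / (mean_entropy - emi)


labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
ami = ami_arithmetic(labels_defined, labels_prediction)
print(round(ami, 4))  # 0.6154
```

For these labels the expected mutual information is about 0.267, which is why the adjusted score is lower than the normalized one.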
3. Fowlkes-Mallows Score
The Fowlkes-Mallows score quantifies the similarity between two clusterings of the same points as the geometric mean of pairwise precision and recall. The example below shows how to use it.
from sklearn.metrics.cluster import fowlkes_mallows_score
labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
print(fowlkes_mallows_score(labels_defined, labels_prediction))
Output
0.6546536707079771
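Pairwise precision and recall can be counted directly, as in the following standard-library sketch. A pair of samples is a true positive if both assignments put it in the same cluster, a false positive if only the predicted assignment does, and a false negative if only the reference assignment does. The helper name is hypothetical.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows_from_pairs(labels_true, labels_pred):
    """Geometric mean of pairwise precision and recall (illustrative helper)."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
    if tp == 0:  # no pair is grouped together by both assignments
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return sqrt(precision * recall)


labels_defined = [0, 0, 1, 1, 1, 1]
labels_prediction = [0, 0, 2, 2, 3, 3]
fmi = fowlkes_mallows_from_pairs(labels_defined, labels_prediction)
print(round(fmi, 4))  # 0.6547
```

For these labels there are 3 true positives, 0 false positives, and 4 false negatives, so precision is 1, recall is 3/7, and the score is the square root of 3/7.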
4. Silhouette Coefficient
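Unlike the metrics above, the Silhouette Coefficient does not need reference labels. For each sample, it compares the mean intra-cluster distance (a) with the mean distance to the nearest other cluster (b) as (b - a) / max(a, b). The silhouette_score function returns the mean of this value over all samples. The following example shows how to use it on the iris dataset.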
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
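# Load the iris feature matrix (150 samples, 4 features)
dataset = datasets.load_iris()
data = dataset.data
# Fit k-means with 3 clusters; random_state makes the initialization reproducible
kmeans = KMeans(n_clusters=3, random_state=1).fit(data)
labels = kmeans.labels_
# Mean silhouette over all samples, using Euclidean distances
print(silhouette_score(data, labels, metric='euclidean'))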
Output
0.551191604619592
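The per-sample computation can be illustrated on a tiny hand-made dataset with the standard library. The points, labels, and helper name below are made up for illustration. The sketch assumes at least two clusters, scores singleton clusters as 0, and does not handle the case where all points coincide.

```python
from math import dist


def silhouette_mean(points, labels):
    """Mean silhouette coefficient with Euclidean distances (illustrative helper)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for a cluster with a single sample
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from the point to itself is 0, so it does not affect the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (5, 0), (5, 1)]  # two tight, well-separated pairs
labels = [0, 0, 1, 1]
score = silhouette_mean(points, labels)
print(round(score, 3))  # 0.802
```

A value close to 1 indicates compact, well-separated clusters. Swapping the labels of two points from different pairs would lower the score sharply.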
5. Contingency Matrix
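The contingency matrix reports the intersection cardinality for every true/predicted cluster pair: entry (i, j) is the number of samples that have the i-th true label and the j-th predicted label. The matrix is not necessarily square, because the two assignments may contain different numbers of distinct labels. The example below illustrates how to compute it.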
from sklearn.metrics.cluster import contingency_matrix
data = ['a', 'a', 'a', 'b', 'b', 'b']
target = [0, 0, 2, 1, 1, 0]
print(contingency_matrix(data, target))
Output
[[2 0 1]
 [1 2 0]]
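The same table can be built by hand to see where each count comes from. The sketch below uses only the standard library and assumes rows and columns follow the sorted order of the distinct labels. The helper name is hypothetical.

```python
from collections import Counter


def contingency_table(labels_true, labels_pred):
    """Count samples for each (true label, predicted label) pair (illustrative helper)."""
    rows = sorted(set(labels_true))
    cols = sorted(set(labels_pred))
    counts = Counter(zip(labels_true, labels_pred))
    # Counter returns 0 for pairs that never occur together.
    return [[counts[(r, c)] for c in cols] for r in rows]


data = ['a', 'a', 'a', 'b', 'b', 'b']
target = [0, 0, 2, 1, 1, 0]
table = contingency_table(data, target)
for row in table:
    print(row)
# [2, 0, 1]
# [1, 2, 0]
```

The first row says that of the three 'a' samples, two were assigned to cluster 0 and one to cluster 2. With two distinct labels on one side and three on the other, the table has 2 rows and 3 columns.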
