Kh-means Clustering: This technique divides the dataset into k groups, with each data point falling into the group that has the closest mean.
According to the research, when the number of clusters k is set to 2, k-means clustering clearly divides data from the lemniscate (infinity form).
The approach still produces a respectable clustering result when k is increased to 4, breaking the dataset into more focused, smaller clusters.
k-medication Grouping:
medoids are used in place of means in a manner similar to k-means. The data point in a cluster with the most central location is called a medoid.
For k = 2, k-medoids clustering likewise offers a distinct separation using the lemniscate dataset.
Nonetheless, the medoid, the most characteristic site of the clusters, is where they are generated. assemble.
Similar to k-means, k-medoids divide data for k = 4, but clusters are generated around the most central data points rather than the mean.
Applications with Noise Using Density-Based Spatial Clustering: DBSCAN
On the basis of dense data point regions, it creates clusters. As seen by the clusters created in the samples given, DBSCAN is less sensitive to outliers than k-means and k-medoids.
DBSCAN found four clusters in the lemniscate example, most likely indicating dense regions divided by less dense or noisy regions.
Comparative Observations:
The report provides visuals to show that, regardless of whether a natural cluster count is more or lower than k, k-means and k-medoids are sensitive to the choice of k and will always divide the data into k clusters.
Since DBSCAN doesn’t need a set number of clusters, it can discover any number of clusters depending on data density, which in some circumstances might lead to a more logical clustering.
The visual results show that if the right k is selected, k-means and k-medoids may successfully identify clusters for geometric designs with distinct separations (such the lemniscate).
The benefit of DBSCAN, however, lies in its capacity to deal with noise and locate clusters without a specified k.
Later on, we’ll implement DBSCAN in Python and upload it to the upcoming releases.