All About Machine Learning: k-means clustering - Wikipedia, the free encyclopedia

Friday, November 21, 2014

k-means clustering - Wikipedia, the free encyclopedia

k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

Given a set of observations (x₁, x₂, …, x_n), where each observation is a d-dimensional real vector, k-means clustering aims to partition the n observations into k (≤ n) sets S = {S₁, S₂, …, S_k} so as to minimize the within-cluster sum of squares (WCSS). In other words, its objective is to find:

\underset{\mathbf{S}} {\operatorname{arg\,min}} \sum_{i=1}^{k} \sum_{\mathbf x \in S_i} \left\| \mathbf x - \boldsymbol\mu_i \right\|^2

where μ_i is the mean of points in S_i.
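As a rough illustration of the objective, the following sketch evaluates the WCSS for a fixed partition. The function names `mean` and `wcss` and the sample data are assumptions made for this example, not part of the article.

```python
def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def wcss(clusters):
    """Within-cluster sum of squares for a partition.

    `clusters` is a list of clusters, each a list of d-dimensional points.
    Empty clusters contribute nothing to the sum.
    """
    total = 0.0
    for cluster in clusters:
        if not cluster:
            continue
        mu = mean(cluster)  # mu_i, the mean of the points in S_i
        for x in cluster:
            total += sum((xc - mc) ** 2 for xc, mc in zip(x, mu))
    return total


# Illustrative data: two clusters in the plane.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 10.0)]]
print(wcss(clusters))
```

k-means searches over partitions for the one that makes this quantity as small as possible; the sketch only scores a partition that is already given.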

Standard algorithm

The most common algorithm uses an iterative refinement technique that alternates between an assignment step and an update step.

Initialization methods

Commonly used initialization methods are Forgy and Random Partition.[9] The Forgy method randomly chooses k observations from the data set and uses these as the initial means. The Random Partition method first randomly assigns a cluster to each observation and then proceeds to the update step, thus computing the initial mean to be the centroid of the cluster's randomly assigned points. The Forgy method tends to spread the initial means out, while Random Partition places all of them close to the center of the data set. According to Hamerly et al.,[9] the Random Partition method is generally preferable for algorithms such as the k-harmonic means and fuzzy k-means. For expectation maximization and standard k-means algorithms, the Forgy method of initialization is preferable.
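The two initialization methods might be sketched as follows. The function names, the `rng` parameter, and the fallback for an empty randomly assigned cluster are assumptions of this example; the text above does not say how an empty cluster should be handled.

```python
import random


def forgy_init(points, k, rng):
    """Forgy: choose k distinct observations and use them as the initial means."""
    return [list(p) for p in rng.sample(points, k)]


def random_partition_init(points, k, rng):
    """Random Partition: assign each observation a random cluster, then
    take the centroid of each cluster as its initial mean."""
    labels = [rng.randrange(k) for _ in points]
    means = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # Assumption: fall back to a random observation if no point
            # happened to be assigned to cluster j.
            members = [rng.choice(points)]
        n = len(members)
        means.append([sum(c) / n for c in zip(*members)])
    return means


rng = random.Random(0)
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
print(forgy_init(points, 2, rng))
print(random_partition_init(points, 2, rng))
```

Because each Random Partition mean averages a random subset of the whole data set, the means tend to land near the overall centroid, which matches the behaviour described above.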

Demonstration of the standard algorithm
1) k initial "means" (in this case k=3) are randomly generated within the data domain (shown in color).
2) k clusters are created by associating every observation with the nearest mean. The partitions here represent the Voronoi diagram generated by the means.
3) The centroid of each of the k clusters becomes the new mean.
4) Steps 2 and 3 are repeated until convergence has been reached.
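The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `sq_dist` and `kmeans` are hypothetical, the initial means are drawn from the observations (Forgy-style) rather than generated anywhere in the data domain, an empty cluster simply keeps its previous mean, and convergence is taken to mean that the assignments stop changing.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k initial means (assumption: k distinct observations).
    means = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: associate every observation with its nearest mean.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, means[j])) for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: the centroid of each cluster becomes the new mean.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old mean
                means[j] = [sum(c) / len(members) for c in zip(*members)]
    return means, labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
means, labels = kmeans(points, k=2)
print(means, labels)
```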

As it is a heuristic algorithm, there is no guarantee that it will converge to the global optimum, and the result may depend on the initial clusters. As the algorithm is usually very fast, it is common to run it multiple times with different starting conditions. However, in the worst case, k-means can be very slow to converge: in particular, it has been shown that there exist certain point sets, even in 2 dimensions, on which k-means takes exponential time, that is 2^{Ω(n)}, to converge.[10] These point sets do not seem to arise in practice: this is corroborated by the fact that the smoothed running time of k-means is polynomial.[11]
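Running the algorithm several times and keeping the best result could look roughly like the sketch below. Here `fit` is a hypothetical callable standing in for any k-means routine that accepts `(points, k, seed)` and returns `(means, labels)`; that signature, the function names, and the use of the lowest within-cluster sum of squares as the selection rule are all assumptions of this example.

```python
def wcss(points, means, labels):
    """Within-cluster sum of squares of a labelled clustering."""
    return sum(
        sum((x - m) ** 2 for x, m in zip(p, means[lab]))
        for p, lab in zip(points, labels)
    )


def best_of_restarts(points, k, fit, n_restarts=10):
    """Call fit(points, k, seed) for seeds 0..n_restarts-1 and return the
    (score, means, labels) triple with the lowest WCSS."""
    best = None
    for seed in range(n_restarts):
        means, labels = fit(points, k, seed)
        score = wcss(points, means, labels)
        if best is None or score < best[0]:
            best = (score, means, labels)
    return best
```

Each restart differs only in its seed, so the runs are independent and could be executed in parallel if needed.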

The "assignment" step is also referred to as the expectation step, and the "update" step as the maximization step, making this algorithm a variant of the generalized expectation-maximization algorithm.

