Apr 11, 2011  03:04 PM | mohammad z
Error in running 'niak_kmeans_clustering'


Hi,

When I run the m-file 'niak_kmeans_clustering', I get this error:

 "Undefined function or method 'niak_part2mat'"

It seems there is no such file as 'niak_part2mat'.



Can the 'niak_kmeans_clustering' function be used for bisecting k-means?

In every phase of the bisecting k-means method, we obtain two clusters that are optimal (the two centroids of the two partitions are updated iteratively until the clusters are optimal).

Does 'niak_kmeans_clustering' give the same result?



Thanks.







Apr 14, 2011  02:04 PM | Pierre Bellec
RE: Error in running 'niak_kmeans_clustering'
Dear Mohammad,

Sorry for the late reply, I needed to work a little bit on this issue. You are using a feature found in the development version, and I forgot to include a dependency from my private libraries. I have just committed the missing function to the repository. Please check out the latest version of the code on the Google Code site:
http://code.google.com/p/niak/source/checkout

You will also need a version of PSOM :
http://www.nitrc.org/projects/psom

Here's an example of how to use bisecting k-means :

>> tseries = [(0.5+randn([100 20])) (1.5+randn([100 20])) (5+randn([100 20])) (6+randn([100 20]))];
>> opt_k.nb_classes = 4;
>> opt_k.flag_bisecting = true;
>> part = niak_kmeans_clustering(tseries,opt_k);
>> niak_visu_part(part);
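
To give an idea of what a bisecting k-means does, here is a rough sketch in plain MATLAB/Octave: the cluster with the largest inertia is repeatedly split in two with a 2-class k-means. This is not the NIAK implementation. It assumes that each column of tseries is one item to cluster (as the layout of the example suggests), and the seeding rule and all variable names are illustrative choices only.

```matlab
% Illustrative bisecting k-means in plain MATLAB/Octave. This is NOT NIAK code.
% Assumption: each column of tseries is one item to cluster.
tseries = [(0.5+randn(100,20)) (1.5+randn(100,20)) (5+randn(100,20)) (6+randn(100,20))];
nb_classes = 4;                       % target number of clusters
part = ones(1, size(tseries, 2));     % start with every item in cluster 1

for num_c = 2:nb_classes
    % Inertia of each current cluster: sum of squared distances to its mean.
    inertia = zeros(1, num_c - 1);
    for c = 1:(num_c - 1)
        x = tseries(:, part == c);
        d = x - repmat(mean(x, 2), 1, size(x, 2));
        inertia(c) = sum(d(:).^2);
    end
    [~, worst] = max(inertia);        % the cluster that will be split
    idx = find(part == worst);
    x = tseries(:, idx);

    % Deterministic seeds for the 2-class k-means (an illustrative choice):
    % the item farthest from the cluster mean, then the item farthest from it.
    d = x - repmat(mean(x, 2), 1, size(x, 2));
    [~, s1] = max(sum(d.^2, 1));
    d = x - repmat(x(:, s1), 1, size(x, 2));
    [~, s2] = max(sum(d.^2, 1));
    cent = x(:, [s1 s2]);

    % Standard k-means iterations with two classes on the selected cluster.
    lab = zeros(1, size(x, 2));
    for iter = 1:100
        d1 = sum((x - repmat(cent(:, 1), 1, size(x, 2))).^2, 1);
        d2 = sum((x - repmat(cent(:, 2), 1, size(x, 2))).^2, 1);
        new_lab = 1 + (d2 < d1);      % 1 = closer to first centroid, 2 = to second
        if isequal(new_lab, lab), break; end
        lab = new_lab;
        cent = [mean(x(:, lab == 1), 2), mean(x(:, lab == 2), 2)];
    end

    % One half keeps the old label, the other half gets the new label num_c.
    part(idx(lab == 2)) = num_c;
end
```

After the loop, part holds one label between 1 and nb_classes for each column of tseries. Each pass adds exactly one cluster, which is why the bisecting variant normally ends with the requested number of clusters.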

Regarding your question on the algorithm: at each iteration, the cluster with the largest inertia (sum of squared Euclidean distances to the cluster mean) is split into two using standard k-means (you can still decide how to initialize this k-means and how to deal with empty clusters). The main advantages of this approach are that it can be much faster than standard k-means, it depends less on the initialization, and it will generally result in exactly the specified number of clusters (whereas clusters often disappear in standard k-means when a large number of clusters is requested). Note that in my experience, a better initialization for k-means such as k-means++ (http://en.wikipedia.org/wiki/K-means%2B%2B), as opposed to a random partition, also addresses these issues and can in some situations perform better than bisecting k-means. You can try it out if you want (it is also implemented in the development version):

>> tseries = [(0.5+randn([100 20])) (1.5+randn([100 20])) (5+randn([100 20])) (6+randn([100 20]))];
>> opt_k.nb_classes = 4;
>> opt_k.flag_bisecting = false;
>> opt_k.type_init = 'kmeans++';
>> part = niak_kmeans_clustering(tseries,opt_k);
>> niak_visu_part(part);
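
For reference, the k-means++ seeding idea can be sketched in a few lines of plain MATLAB/Octave: the first seed is drawn uniformly, and each following seed is drawn with probability proportional to its squared distance to the closest seed already chosen. This is a sketch of the general technique, not the code behind the 'kmeans++' option in NIAK. It assumes that each column of tseries is one item, and the variable names are illustrative.

```matlab
% Illustrative k-means++ seeding in plain MATLAB/Octave. This is NOT NIAK code.
% Assumption: each column of tseries is one item to cluster.
tseries = [(0.5+randn(100,20)) (1.5+randn(100,20)) (5+randn(100,20)) (6+randn(100,20))];
nb_classes = 4;
nb_items = size(tseries, 2);

seeds = zeros(1, nb_classes);
seeds(1) = ceil(rand * nb_items);     % first seed: uniform draw among the items
dist2 = inf(1, nb_items);             % squared distance to the closest seed so far

for c = 2:nb_classes
    % Update the distances using the seed that was added last.
    last = tseries(:, seeds(c - 1));
    dist2 = min(dist2, sum((tseries - repmat(last, 1, nb_items)).^2, 1));

    % Draw the next seed with probability proportional to dist2.
    % Items already used as seeds have dist2 = 0 and cannot be drawn again.
    cum = cumsum(dist2);
    seeds(c) = find(cum >= rand * cum(end), 1, 'first');
end

centroids = tseries(:, seeds);        % initial centroids for a standard k-means
```

The columns of centroids would then serve as the starting point of the usual k-means iterations. Because distant items are more likely to be drawn, the seeds tend to be spread across the data, which is the intuition for why this initialization loses fewer clusters than a random partition.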


I hope this helps,

Pierre

Apr 20, 2011  10:04 AM | mohammad z
RE: Error in running 'niak_kmeans_clustering'
Thanks a lot.