I've implemented the O-Cluster algorithm, and while running some tests I noticed strange results. I hope someone here has worked with this algorithm and can answer my questions:
1. O-Cluster uses data histograms. Are the bin values computed in the usual way for histograms, i.e. as a density:
(number of points in bin range) / (number of all points * bin width)?
2. Is the equation for the chi-square test correct?
Here, observed is the histogram value at the valley, and expected is the average of the histogram counts of the valley and the lower peak.
3. I compute the bin width as 3.49 * standard_deviation * (number of points)^(-1/3).
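To make items 1 and 3 concrete, here is a minimal sketch of the two computations as described above. It is only an illustration: the function names `scott_bin_width` and `histogram` are hypothetical, the use of the sample standard deviation and of bins starting at the minimum value are assumptions, and the data is made up. The bin width formula has the same form as Scott's normal reference rule.

```python
import math
import statistics


def scott_bin_width(values):
    """Bin width as in item 3: 3.49 * standard deviation * n^(-1/3)."""
    n = len(values)
    sigma = statistics.stdev(values)  # assumption: sample standard deviation
    width = 3.49 * sigma * n ** (-1.0 / 3.0)
    if width <= 0:
        raise ValueError("constant data gives a zero bin width")
    return width


def histogram(values, width):
    """Return (counts, densities) for equal-width bins starting at min(values)."""
    lo = min(values)
    n_bins = max(1, math.ceil((max(values) - lo) / width))
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so that the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    # Item 1: count / (number of all points * bin width).
    densities = [c / (n * width) for c in counts]
    return counts, densities


# Made-up one-dimensional data with two obvious groups.
data = [1.0, 1.2, 1.1, 0.9, 5.0, 5.2, 4.9, 5.1]
width = scott_bin_width(data)
counts, densities = histogram(data, width)
```

With this normalization the densities multiplied by the bin width sum to 1, whereas `counts` keeps the raw number of points per bin.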
My problem is that I have never reached the minimum statistical significance (3.843) on any of my data sets. It is possible that my data is unsuitable (maybe too small for O-Cluster, or with too few dimensions), but some other clustering algorithms were able to find clusters in these data sets.
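For reference, a minimal sketch of the valley test from item 2 compared against the 3.843 threshold. The equation itself is not shown in this post, so the form used here, 2 * (observed - expected)^2 / expected, is an assumption based on how the test is usually stated for O-Cluster. The names `valley_chi_square`, `is_significant` and `CHI2_THRESHOLD` are hypothetical, and the numbers are made up.

```python
CHI2_THRESHOLD = 3.843  # minimum significance value quoted above


def valley_chi_square(valley, lower_peak):
    """Assumed statistic: 2 * (observed - expected)^2 / expected."""
    observed = valley
    # Expected is the average of the valley and the lower peak.
    expected = (valley + lower_peak) / 2.0
    if expected <= 0:
        return 0.0  # empty bins cannot give a significant split
    return 2.0 * (observed - expected) ** 2 / expected


def is_significant(valley, lower_peak, threshold=CHI2_THRESHOLD):
    """True when the assumed statistic reaches the threshold."""
    return valley_chi_square(valley, lower_peak) >= threshold


# Raw counts: the valley bin holds 5 points, the lower peak holds 25.
stat_counts = valley_chi_square(5, 25)  # about 13.33

# The same two bins as densities, for 100 points and a bin width of 0.5.
stat_density = valley_chi_square(5 / (100 * 0.5), 25 / (100 * 0.5))  # about 0.27
```

Under this assumed form the statistic is not scale-invariant: the same valley and peak pass the threshold when given as counts and fail it when given as density values, so which of the two is fed into the test may be worth checking.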
Thank you very much for any clues, ideas, or pointers to where I can find additional help.