This is my first time writing an application that uses a database. I'm using VC++ 6.0. I have an algorithm that I need to implement, but I don't know how to start. The algorithm, called "CoolCat", clusters records with categorical attributes. It has two steps:
1. Initialization:
Find a suitable set of clusters out of a sample S taken from the data set. We first find the k most "dissimilar" records from the sample set by maximizing the minimum pairwise entropy of the chosen points. We start by finding the two points ps1, ps2 that maximize E(ps1, ps2) and placing them in two separate clusters (C1, C2), marking the records.
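To make the initialization step concrete, here is a minimal sketch in standard C++. It rests on several assumptions that are not stated above: a record is stored as a vector of strings (`Record`), the pairwise entropy E(p1, p2) is the entropy of a two-record cluster with independent attributes (so each differing attribute contributes 1 bit), and seeds 3 to k are chosen greedily as the unmarked record whose minimum pairwise entropy to the seeds already chosen is largest. The names `Record`, `pairwiseEntropy` and `pickSeeds` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation: one categorical value per attribute.
typedef std::vector<std::string> Record;

// Entropy in bits of a cluster holding exactly these two records, assuming
// independent attributes: a matching attribute adds 0 bits, a differing one
// adds 1 bit (two values, each with probability 1/2).
double pairwiseEntropy(const Record& a, const Record& b) {
    assert(a.size() == b.size());
    double e = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) e += 1.0;
    }
    return e;
}

// Returns the indices (into `sample`) of k seed records, one per cluster.
std::vector<std::size_t> pickSeeds(const std::vector<Record>& sample,
                                   std::size_t k) {
    assert(k >= 2 && sample.size() >= k);
    std::vector<bool> marked(sample.size(), false);
    std::vector<std::size_t> seeds;

    // First two seeds: the pair with the largest pairwise entropy.
    std::size_t bestA = 0, bestB = 1;
    double best = -1.0;
    for (std::size_t i = 0; i < sample.size(); ++i) {
        for (std::size_t j = i + 1; j < sample.size(); ++j) {
            double e = pairwiseEntropy(sample[i], sample[j]);
            if (e > best) { best = e; bestA = i; bestB = j; }
        }
    }
    seeds.push_back(bestA); marked[bestA] = true;
    seeds.push_back(bestB); marked[bestB] = true;

    // Assumed greedy continuation: pick the unmarked record whose minimum
    // pairwise entropy to the already chosen seeds is the largest.
    while (seeds.size() < k) {
        std::size_t bestIdx = 0;
        double bestMin = -1.0;
        for (std::size_t i = 0; i < sample.size(); ++i) {
            if (marked[i]) continue;
            double minE = pairwiseEntropy(sample[i], sample[seeds[0]]);
            for (std::size_t s = 1; s < seeds.size(); ++s) {
                double e = pairwiseEntropy(sample[i], sample[seeds[s]]);
                if (e < minE) minE = e;
            }
            if (minE > bestMin) { bestMin = minE; bestIdx = i; }
        }
        seeds.push_back(bestIdx);
        marked[bestIdx] = true;
    }
    return seeds;
}
```

Each returned index would become the first member of one initial cluster. The pair search is quadratic in the sample size, which is one reason to run it on a sample rather than on the whole data set.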
2. Incremental step:
   1. Given an initial set of clusters C = C1, ..., Ck
   2. Bring points to memory from disk and for each point p do
   3.   For i = 1, ..., k
   4.     Place p in Ci and compute E(C^i),
          where C^i denotes the clustering obtained by placing p in cluster Ci
   5.   Let j = argmin_i E(C^i)
   6.   Place p in Cj
   7. Until all points have been placed in some cluster.
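The incremental step above could be sketched in standard C++ roughly as follows. This is only a sketch under assumptions: the steps above do not define E, so I assume the entropy of a cluster is the sum of the Shannon entropies (in bits) of its attributes, and the entropy of a clustering is the size-weighted average of its cluster entropies. The names `Record`, `Cluster`, `weightedEntropy` and `placePoint` are hypothetical, and reading the points from the database is left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical representation: one categorical value per attribute.
typedef std::vector<std::string> Record;

// A cluster is summarized by value counts per attribute, so the records
// themselves do not have to stay in memory.
struct Cluster {
    std::size_t size;
    std::vector<std::map<std::string, std::size_t> > counts;

    explicit Cluster(std::size_t numAttrs) : size(0), counts(numAttrs) {}

    void add(const Record& r) {
        assert(r.size() == counts.size());
        for (std::size_t a = 0; a < r.size(); ++a) ++counts[a][r[a]];
        ++size;
    }

    // Assumed definition: sum over attributes of the Shannon entropy (bits).
    double entropy() const {
        if (size == 0) return 0.0;
        double e = 0.0;
        for (std::size_t a = 0; a < counts.size(); ++a) {
            std::map<std::string, std::size_t>::const_iterator it;
            for (it = counts[a].begin(); it != counts[a].end(); ++it) {
                double p = static_cast<double>(it->second) / size;
                e -= p * std::log(p) / std::log(2.0);
            }
        }
        return e;
    }
};

// Contribution of one cluster to the clustering entropy, before dividing
// by the total number of points.
double weightedEntropy(const Cluster& c) { return c.size * c.entropy(); }

// Tries p in every cluster, keeps the placement with the lowest clustering
// entropy, and returns the index j of the chosen cluster.
std::size_t placePoint(std::vector<Cluster>& clusters, const Record& p) {
    assert(!clusters.empty());
    double total = 0.0;
    std::size_t n = 1;  // points in the clustering once p has been placed
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        total += weightedEntropy(clusters[i]);
        n += clusters[i].size;
    }
    std::size_t bestJ = 0;
    double bestE = 0.0;
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        Cluster trial = clusters[i];  // copy, so the real cluster is untouched
        trial.add(p);
        double e = (total - weightedEntropy(clusters[i])
                    + weightedEntropy(trial)) / n;
        if (i == 0 || e < bestE) { bestE = e; bestJ = i; }
    }
    clusters[bestJ].add(p);
    return bestJ;
}
```

The caller would create k clusters, add one seed record to each, and then call `placePoint` once for every remaining record as it is read from disk. Copying the trial cluster keeps the sketch short; an implementation concerned with speed might update only the counts that change.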
Now, the starting point is unclear to me. What should my first step be?