  1. #1
    Join Date
    May 2011
    Posts
    5

    Missing dataset (k-nn)

    I am comparing three classification methods: decision tree, Naive Bayes, and k-NN. My dataset has missing values, which I was able to replace using the Weka value replace tool.

    I ran two tests, one before replacing the missing values and one after. Before the replacement, k-NN correctly classified 55% of the instances; after it, k-NN correctly classified 56%.

    I was wondering why this happens with k-NN but not with the other classification methods?

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I doubt you'll get a good response on this forum but I'll start the ball rolling:
    • If you only have a small amount of data (fewer than a few hundred instances), then small anomalies will look like large trends.
    • Is your Weka tool putting in correct replacement values?
    • Is it then correct to store these values in a database rather than just record that no value is known?
    • Assuming the three methods all use different strategies to classify items, isn't it fair to guess that they'll use your new artificial data in different ways?
    • Have you looked at your coursework notes?
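
    To put rough numbers on the first and fourth points, here is a minimal Python sketch. It is not Weka code and makes several assumptions: the test set size of 200 is hypothetical (the thread does not give one), the toy rows are invented, "missing" is represented as None, the replacement is a plain column mean, and the distance simply skips attributes that are missing in either row. Weka's own replacement tool and k-NN distance may behave differently.

```python
import math


def accuracy_std_error(accuracy, n_instances):
    """Binomial standard error of an accuracy measured on n_instances test cases."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_instances)


def euclidean_skip_missing(a, b):
    """Euclidean distance over attributes present in both rows (None = missing).

    Assumption for illustration only: missing attributes are ignored.
    """
    return math.sqrt(sum((x - y) ** 2
                         for x, y in zip(a, b)
                         if x is not None and y is not None))


def mean_impute(rows):
    """Return a copy of rows with each None replaced by its column's mean."""
    means = []
    for column in zip(*rows):
        known = [v for v in column if v is not None]
        means.append(sum(known) / len(known))
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def nearest_index(rows, query_index, distance):
    """Index of the row closest to rows[query_index], excluding itself."""
    candidates = [i for i in range(len(rows)) if i != query_index]
    return min(candidates,
               key=lambda i: distance(rows[query_index], rows[i]))


# Invented toy data: row 0 is the query and lacks its second attribute.
ROWS = [
    [0.0, None],
    [1.0, 10.0],
    [2.0, 4.0],
    [9.0, 4.0],
]

if __name__ == "__main__":
    # Hypothetical test set of 200 instances at 55% accuracy.
    print("std error:", accuracy_std_error(0.55, 200))
    print("nearest before imputation:",
          nearest_index(ROWS, 0, euclidean_skip_missing))
    print("nearest after imputation:",
          nearest_index(mean_impute(ROWS), 0, euclidean_skip_missing))
```

    With 200 test instances the standard error comes out at roughly 3.5 percentage points, so a move from 55% to 56% is well inside the noise. In the toy rows, filling in the missing value changes which row is nearest to row 0 (row 1 before, row 2 after). k-NN depends directly on such distances, so replaced values can flip individual predictions, whereas the other two methods may use the replaced values quite differently.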
