Results 1 to 7 of 7
  1. #1
    Join Date
    Nov 2011
    Posts
    3

    Unanswered: Advice on table attributes

    I have a database in Microsoft Access 2010. One table contains test results for a series of pollutants. Historically the fields have been text fields because we need to account for a pollutant result that has no detection or if it wasn't test for. So the field will look like..

    Iron
    .025
    .03
    NT (not tested for)
    ND (not detected)

    I'm not sure how to run a query against somethign like this for reporting. I want to run a query for the min, mean and avg value for a certain pollutant. I thought of making NT a -1 and ND a -2 so I can change the field to a number. But I do need to count how many times a pollutant was tested for and not detected, the negative messes up my min, max and avg.

    does anyone have a suggestion on how to structure this table, should I have another column? This table contains 10 pollutants that are test for.

    Thank you

  2. #2
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    I don't know the business rules that apply to your database but it seems logical to me that Non-Detected should be 0 (zero) and Non-Tested should be Null. You would not have any problem for filtering the data: (= 0) and( Is Null)
    Have a nice day!

  3. #3
    Join Date
    Nov 2011
    Posts
    3
    I thought a 0 would be ok as well, but for the scientist that could be a pollutants value is 0 and that's not the case when they mean non detected. I'm not sure how to seperate the difference or should I?

    Thank you!!!

  4. #4
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    I agree that there is a very subtle difference between "We analysed the sample and pullutant xyz was not detected" and "We analysed the sample and the measured value for pollutant xyz was zero". However it should not make any difference as far as statistical computations are concerned.

    Anyway, scientist are always right (except when thet're wrong ). You can use some "impossible" values, such as as negative numbers, to indicate a "special" result (non-detected, not tested, not available, don't know, etc.); it just does not seems right to me.

    If you chose to do so, eliminating those values in a query should not be a problem: (WHERE Result >=0). I would then create a lookup table to associate these "special" numeric values with their meanings (or captions). This lookup table could be used by the interface (in a Combo for instance) to prevent the users from typing whatever they want in such cases and make a mess in the table.
    Have a nice day!

  5. #5
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    I think using negative values is not a smart call.
    why, because it doesn't make logical sense, someone needs to know why negative values are being used and what they mean. its fine for as long as you are around but for this sort for data it needs to be clear.
    Personally I'd use null for .not tested for', if a test was done then there should be a result. even ifd that result is zero

    it does make a difference especially if you use the domain aggregate functions
    I'd rather be riding on the Tiger 800 or the Norton

  6. #6
    Join Date
    Nov 2011
    Posts
    3
    I wish I could use 0 or the actual value but if the number is too low the lab does not send you the value it just has non-detected because their test couldn't detect it, which doesn't mean it's a value of 0.

    From reading the recommendations here I think I'm going to have to use negative numbers to correspond with the different codes and Null for those pollutants that were not test for.

    Thank you!

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    use two columns instead of one

    Iron_test CHAR(2)
    Iron_result DECIMAL(9,3)

    then Iron_test can be ND or NT and Iron_result will be NULL

    when Iron_test is DT (detected) then Iron_result will have a valid value

    then just run your means and maxes and so on on the Iron_result column, and all the NULLs will be ignored

    never conflate two different attributes into the same column, as you would if you mixed a code like NT with a value like 0.25 (which would also mean you'd need to do conversion from character to numeric in order to do means and maxes and so on)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •