Results 1 to 8 of 8
  1. #1
    Join Date
    Aug 2011
    Posts
    4

    Unanswered: Partition and Count

    Hi,

    I'm trying get the partition function count or sum the values for specific range of times. It's working ok but only on not duplicated data.

    Code:
    select 
      partition(hour([time]),0,23,1)
    , count(Visit) 
    , count(Qty)
    from orders
    group by partition(hour([time]),0,23
    The visits column are including duplicated information. Can you give me some advice or help with below?

    Thank you very much.
    Attached Files Attached Files

  2. #2
    Join Date
    Oct 2003
    Location
    Canada
    Posts
    574
    I looked at your db. It works perfectly. Where are the duplicated data that are causing you trouble? The visit column has two entries for A127, but the visit dates are different as are the quantities. Are you suggesting that those should not both be counted and totalled? If so, how will the database know which one to exclude? If you do the count and sums manually, how would you determine which to exclude?

    Given the data, the following looks right to me:

    V T Range_Time
    1 4 6: 6
    1 9 7: 7
    2 4 11:11
    2 8 13:13
    1 1 16:16
    John M Reynolds

  3. #3
    Join Date
    Aug 2011
    Posts
    4
    Quote Originally Posted by jmrSudbury View Post
    I looked at your db. It works perfectly. Where are the duplicated data that are causing you trouble? The visit column has two entries for A127, but the visit dates are different as are the quantities. Are you suggesting that those should not both be counted and totalled? If so, how will the database know which one to exclude? If you do the count and sums manually, how would you determine which to exclude?

    Given the data, the following looks right to me:

    V T Range_Time
    1 4 6: 6
    1 9 7: 7
    2 4 11:11
    2 8 13:13
    1 1 16:16
    Hm, what i did try to exclude duplicated row's for visits.

    For 11:11 it should be only 1 instead 2 as it was this same visit id.
    In example person visting website and is making an order for 2 products.
    Now i need count how many unique visits where made in specific range time.

    We should get something like below:

    V T Range_Time
    1 4 6: 6
    1 9 7: 7
    1 4 11:11
    2 8 13:13
    1 1 16:16

    I hope that it make sense...

    Many thanks,

  4. #4
    Join Date
    Oct 2003
    Location
    Canada
    Posts
    574
    So...

    Is it normal for two visits that occurred on two separate days to end up with the same visit ID number? How do you know which one to count and which to ignore?
    John M Reynolds

  5. #5
    Join Date
    Aug 2011
    Posts
    4
    Quote Originally Posted by jmrSudbury View Post
    So...

    Is it normal for two visits that occurred on two separate days to end up with the same visit ID number? How do you know which one to count and which to ignore?
    It's my fault as it should be this same date for this same visit.

  6. #6
    Join Date
    Oct 2003
    Location
    Canada
    Posts
    574
    Ok. So the data error propagated through the query. The best thing to do is to prevent erroneous data from being entered. The next best thing is to run a query that will show you the bad data, so it can be fixed.

    As long as you have bad initial data, you will have bad final data. If you insist on not fixing the bad data problem, then you can also make a query that will generate the dataset without the extra records. Usually, you would have to use the First function by visit number. Then use this to feed your original query and the counts will then be accurate.

    Now what about the Total? Should it not be 4? You said that the 2 count should only be a 1, but what about the Sum? If it is correct, then you are going to have to run the count separately from the sum then joint the two.

    Usually, the easiest way is to just fix the poor data problem.
    John M Reynolds

  7. #7
    Join Date
    Aug 2011
    Posts
    4
    Quote Originally Posted by jmrSudbury View Post
    Ok. So the data error propagated through the query. The best thing to do is to prevent erroneous data from being entered. The next best thing is to run a query that will show you the bad data, so it can be fixed.

    As long as you have bad initial data, you will have bad final data. If you insist on not fixing the bad data problem, then you can also make a query that will generate the dataset without the extra records. Usually, you would have to use the First function by visit number. Then use this to feed your original query and the counts will then be accurate.

    Now what about the Total? Should it not be 4? You said that the 2 count should only be a 1, but what about the Sum? If it is correct, then you are going to have to run the count separately from the sum then joint the two.

    Usually, the easiest way is to just fix the poor data problem.
    My fault with sample database as it should be this same date. The data in my main database is fine and it showing different this same date for this same visit number.

    I did try do count and next join two tables but it wasn't working at all.
    Will be appreciate for some kind hint or example that i can use...

  8. #8
    Join Date
    Oct 2003
    Location
    Canada
    Posts
    574
    Alright. You have totally lost me. I thought your example database had real data. In your main database, do you have two records with the same visit ID?

    Your Orders table has these fields:
    [ID], Visit, Product, Qty, Date_Visit, Time

    Where [ID] is the primary key as denoted by my square brackets.

    Here are the data:
    1 A123 123 3 18/05/2011 1:34:13 PM
    2 A124 143 1 03/04/2011 4:05:13 PM
    3 A125 193 9 06/07/2011 7:23:19 AM
    4 A126 137 4 13/07/2011 6:59:01 AM
    5 A127 139 3 05/08/2011 11:06:12 AM
    6 A127 129 1 11/04/2011 11:06:12 AM
    7 A129 121 5 09/08/2011 1:59:01 PM

    Perhaps the problem is you have an order table that requires duplicate Visit ID data. The issue is that you need to have a sub table that shows the Product and Quantity fields.

    I would have an order table like this:
    [Visit], Date_Visit, Time

    A123 18/05/2011 1:34:13 PM
    A124 03/04/2011 4:05:13 PM
    A125 06/07/2011 7:23:19 AM
    A126 13/07/2011 6:59:01 AM
    A127 05/08/2011 11:06:12 AM
    A129 09/08/2011 1:59:01 PM


    I would also have an OrderDetail table like this:
    [ID], Visit, Product, Qty

    1 A123 123 3
    2 A124 143 1
    3 A125 193 9
    4 A126 137 4
    5 A127 139 3
    6 A127 129 1
    7 A129 121 5

    Then, when you run your query on the main table for counting orders, you will get the correct number.
    John M Reynolds

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •