Results 1 to 7 of 7
  1. #1
    Join Date
    Jun 2011
    Posts
    3

    Unanswered: Select rows based on total percentage of records present

    Hi Guys,

    I am clueless on how to do this. Here is what i want: Lets say i have a table with 3 columns

    Id | Name | campaignid
    -----------------------------
    1. | John. | 1
    ----------------------------
    2 | Peter. | 2
    ---------------------------
    3. | Martin. | 3
    ----------------------------
    4. | Sarah. | 2
    ----------------------------
    5. | Jessica. | 3
    ----------------------------
    6. | Parker. | 3

    Now in reality i will have a million of records with approximately 1000 different campaignid, at a time i can only process 3000 records, so i want to select 3000 records and process them, then get another 3000 records and process them, so on and so forth. When i select 3000 records from the database I want to get the rows based on the percentage of campaignids present in total records. For ex in the above example, campaignid 3 is present 50%, 2 is 33.33% and 1 is 17.77%. Now if this was also the compisition of a million of records and i had to select 3000 records out of it, then campaignid 3 should be 1500 times, campaign id 2 should be 999 times and remaining should be campaign id 1, i.e. 50%, 33.3% and 17% respectively.

    Please could anyone help me out with an efficient query for this.

  2. #2
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Your business requirements are simply ridiculous. I strongly encourage you to rethink them.
    Why would your processing care whether the percentage in the subsets matches the percentages in the entire set?
    It is, of course, possible to write SQL code to do exactly what you want. But it would be horrendously complex, time-consuming, and fragile.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  3. #3
    Join Date
    Mar 2007
    Location
    Holmestrand, Norway
    Posts
    332
    Where's the like button?
    Ole Kristian Velstadbråten Bangås - Virinco - MSSQL.no - Facebook - Twitter

  4. #4
    Join Date
    Jun 2011
    Posts
    3
    Hi,

    I understand what you mean, but we've got to send this data to an external application which sends SMS and they can only take some data a time. Since we would have several campaigns we want to give each campaign owner a feel that there campaign is running and not waiting for some another campaign to finish.

    Another approach is to just send the data one by one and eventually every campaign would be served. But that could make the campaign owner very nervous about it.

    Another alternative would be to maintain a round robin queue of campaigns in my program and serve them one by one, i.e. take the maximum records the external application can handle from one campaign and send it, then the next turn is for another campaign till eventually all the campaigns are processed.

    I hope you understand what I mean and I would really appreciate any alternatives to this.


    Regards,

    Zahid

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by zahidmaqbool View Post
    Since we would have several campaigns we want to give each campaign owner a feel that there campaign is running and not waiting for some another campaign to finish.
    Then why wouldn't a random sample be sufficient? The ratio or each subset would be similar to the percentages of the entire data set, though not exact.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  6. #6
    Join Date
    Jun 2011
    Posts
    3
    Quote Originally Posted by blindman View Post
    Then why wouldn't a random sample be sufficient? The ratio or each subset would be similar to the percentages of the entire data set, though not exact.
    Yeah, random should be fine. Any ideas on how to get random rows? Also will there be any performance problems while trying to get random?

  7. #7
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    If data is entered randomly, then returning the data in the order of the ID should be random.
    You could also "select top 3000 * from YourTable order by NEWID()", which will give you a random sample.
    The more "randomness" you insist on, the more difficult it will be to code.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •