Results 1 to 5 of 5

Thread: Duplicate Data

  1. #1
    Join Date
    Oct 2010
    Posts
    19

    Unanswered: Duplicate Data

    Hi All,

    Currently i have the below query,after the where condition inline query, the data which i am getting is 993 and with these data when i select these value in where condition im getting around 1000 data .

    Brief, the data is getting repeated in ArchiveBilling.dbo.tempbillingconsolidationid table. All i want is to distinct the those data when i do the select condition. i want to get data of 993 instead of 1000.


    select consolidationid,billingbatchid from ArchiveBilling.dbo.tempbillingconsolidationid with (nolock)
    where consolidationid in (select consolidationid from tempdistinctbillingconsolidationid where consolidationid not in (select consolidationid from billingconsolidation))

  2. #2
    Join Date
    Apr 2012
    Posts
    213
    Hi,

    Are you sure that there is only one row in the table ArchiveBilling.dbo.tempbillingconsolidationid for each consolidationid?

  3. #3
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1
    Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect.

    This is minimal polite behavior on SQL forums.

    Sample data is also a good idea, along with clear specifications. It is very hard to debug code when you do not let us see it.

    The names imply that you are using staging tables to mimic a 1950's style "punch cards & mag tapes" batch processing, where things are done by passing data from tape to tape, collecting bad data at each pass, until it is put on the curretn Master Tape.

    In RDBMS, each transaction would have an identifier, but the consolidation would not. It wodl not be batched up.

  4. #4
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by Celko View Post
    Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect.

    This is minimal polite behavior on SQL forums.
    On this forum, that is not the case.
    We, or at least some of us, don't want long posts filled with data that may be extraneous to the issue at hand.
    If we need additional information regarding the table structure, we'll request it.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    First, rewrite this query to eliminate those nested selects. They simply confuse the logic. I wonder if the statement wasn't generated by Crystal Reports or some other query tool?
    This should be logically equivalent to your query:

    Code:
    select	Instance1.consolidationid,
    	Instance1.billingbatchid
    from	ArchiveBilling.dbo.tempbillingconsolidationid as Instance1
    	inner join tempdistinctbillingconsolidationid as Instance2 on  Instance1.consolidationid = Instance2.consolidationid
    	left outer join billingconsolidation on Instance2.consolidationid = billingconsolidation.consolidationid
    where	billingconsolidation.consolidationid is null
    Now, it should be evident that the self join on tempbillingconsolidationid is superfluous (assuming both instances are in the same database and schema?).
    So, eliminate the unnecessary join:

    Code:
    select	tempbillingconsolidationid.consolidationid,
    	tempbillingconsolidationid.billingbatchid
    from	ArchiveBilling.dbo.tempbillingconsolidationid
    	left outer join billingconsolidation on tempbillingconsolidationid.consolidationid = billingconsolidation.consolidationid
    where	billingconsolidation.consolidationid is null
    Much simpler, eh?
    Now just throw a DISTINCT clause on it, to return only unique combinations of consolidationid and billingbatchid:
    Code:
    select	distinct
    	tempbillingconsolidationid.consolidationid,
    	tempbillingconsolidationid.billingbatchid
    from	ArchiveBilling.dbo.tempbillingconsolidationid
    	left outer join billingconsolidation on tempbillingconsolidationid.consolidationid = billingconsolidation.consolidationid
    where	billingconsolidation.consolidationid is null
    Does that give you what you want?
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •