Page 1 of 2 12 LastLast
Results 1 to 15 of 19
  1. #1
    Join Date
    Apr 2003
    Posts
    64

    Question Unanswered: Listing keys not present in other tables - an SQL question

    Hello!

    We have a series of processes which maintain, several tables of trades. We need to be able to verify, that, at the end, all trades have been processed and each is listed in all of the tables.

    Given the same column name M_TRADENUMB in four table A, B, C, and D, how do I list the numbers, which are present in only some of the four tables -- with the minimum amount of queries.

    Thanks!
    If you ever back up Sybase, you want this backup-server plugin.

  2. #2
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    Do you have a base table that will always have the number? I am thinking of a left join

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by teterin
    Given the same column name M_TRADENUMB in four table A, B, C, and D, how do I list the numbers, which are present in only some of the four tables -- with the minimum amount of queries.
    will just one query do?
    Code:
    select M_TRADENUMB 
      from ( select M_TRADENUMB from A
             union all
             select M_TRADENUMB from B
             union all
             select M_TRADENUMB from C
             union all
             select M_TRADENUMB from D
           ) as datas
    group
        by M_TRADENUMB
    having count(*) < 4
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Apr 2003
    Posts
    64
    Quote Originally Posted by pdreyer
    Do you have a base table that will always have the number? I am thinking of a left join
    I think, there is a fifth table (let's call it Z), that could be used for that...
    If you ever back up Sybase, you want this backup-server plugin.

  5. #5
    Join Date
    Apr 2003
    Posts
    64
    Quote Originally Posted by r937
    Code:
    select M_TRADENUMB 
      from ( select M_TRADENUMB from A
             union all
    ...
    having count(*) < 4
    This looks beautiful -- but what is "union all"? Is that a Sybase thing, or a standard SQL thing?.. Thanks a lot!
    If you ever back up Sybase, you want this backup-server plugin.

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by teterin
    but what is "union all"? Is that a Sybase thing, or a standard SQL thing?
    yes it is standard sql, and i would be very surprised if it was not supported in sybase

    give it a try and see

    then, just for laughs, change the UNION ALLs to UNIONs and see what difference it makes to the results

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Apr 2003
    Posts
    64
    Quote Originally Posted by r937
    then, just for laughs, change the UNION ALLs to UNIONs and see what difference it makes to the results
    Well, since I modified the query you suggested to also identify the tables, where each number is listed, there is no difference between UNION ALL and UNION

    The query below not only shows, which numbers are missing from some of the tables, but also, which tables they are present in:

    Code:
    select count(TradeNumber) Lines, TradeNumber, Source 'Present In:'
    from (
        select distinct M_TRADENUMB TradeNumber, "A1" Source
           from A1
        select distinct M_TRADENUMB TradeNumber, "A1" Source
           from A2
    ...
        select distinct M_TRADENUMB TradeNumber, "An" Source
           from An
    ) as MEOW
        group by TradeNumber
        having count(TradeNumber) <> n order by TradeNumber
    This works - and is quite fast too... Thanks.
    If you ever back up Sybase, you want this backup-server plugin.

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by teterin
    Well, since I modified the query you suggested to also identify the tables, where each number is listed, there is no difference between UNION ALL and UNION
    that is correct, they return the same result sets

    however, UNION is a low slower, because it has to take every pair of rows and compare them column by column in order to eliminate completely duplicate rows...

    ... of which there won't be any, since each subselect in the union has a different constant on its rows

    therefore the sort of the complete result set required to detect duplicates is wasted cycles

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    This works - and is quite fast too...
    Note: You are using non standard sql
    i.e. Your select list that includes aggregates have columns that are not arguments of aggregate functions and are not included in the group by clause

    Make sure you understand the result from these queries:
    Code:
    select id, type, count(id) cnt
    from 
    (select 1,1 union all
     select 2,1 union all
     select 3,2 union all 
     select 4,2 union all 
     select 5,2 union all 
     select 6,2 
    ) v (id,type)
    where id>=4
    group by type
    having count(*)=3
    
    id          type        cnt         
    ----------- ----------- ----------- 
              3           2           3 
              4           2           3 
              5           2           3 
              6           2           3 
    
    select id, type, count(id) cnt
    from 
    (select 1,1 union all
     select 2,1 union all
     select 3,2 union all 
     select 4,2 union all 
     select 5,2 union all 
     select 6,2 
    ) v (id,type)
    group by type
    having id>4
      and count(*)=4
    
    id          type        cnt         
    ----------- ----------- ----------- 
              5           2           4 
              6           2           4
    Here is another option using a left join
    Code:
    -- generate some test data
    select * into #a from sybsystemprocs..sysobjects where id>50
    select * into #b from sybsystemprocs..sysobjects where id%7 !=0
    select * into #c from sybsystemprocs..sysobjects where id%10!=0
    select * into #d from sybsystemprocs..sysobjects where id%100!=0
    create unique index ix1 on #a(id)
    create unique index ix1 on #b(id)
    create unique index ix1 on #c(id)
    create unique index ix1 on #d(id)
    go
    
    -- Now the left join 
    select o.id 
    ,case when a.id is null then 'N' else 'Y' end inA
    ,case when b.id is null then 'N' else 'Y' end inB
    ,case when c.id is null then 'N' else 'Y' end inC
    ,case when d.id is null then 'N' else 'Y' end inD
    from sybsystemprocs..sysobjects o
    left join #a a on a.id=o.id
    left join #b b on b.id=o.id
    left join #c c on c.id=o.id
    left join #d d on d.id=o.id
    where  case when a.id is null then 0 else 1 end
         +case when b.id is null then 0 else 1 end
         +case when c.id is null then 0 else 1 end
         +case when d.id is null then 0 else 1 end < 4
    order by o.id
    go
    
    Table: sysobjects scan count 1, logical reads: (regular=23 apf=0 total=23), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #a___________00002180016498795 scan count 601, logical reads: (regular=57 apf=0 total=57), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #b___________00002180016498795 scan count 601, logical reads: (regular=9 apf=0 total=9), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #c___________00002180016498795 scan count 601, logical reads: (regular=9 apf=0 total=9), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #d___________00002180016498795 scan count 601, logical reads: (regular=9 apf=0 total=9), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Total writes for this command: 0 
    Execution Time 0. 
    SQL Server cpu time: 0 ms.  SQL Server elapsed time: 13 ms. 
    
    --and a modified version from r937's example
    select id, max(A) inA,max(B) inB, max(C)inC, max(D)inD 
    from 
    (select id,'Y' A,'N' B, 'N' C, 'N' D from #a union all
     select id,'N' A,'Y' B, 'N' C, 'N' D from #b union all
     select id,'N' A,'N' B, 'Y' C, 'N' D from #c union all
     select id,'N' A,'N' B, 'N' C, 'Y' D from #d 
    ) t
    group by id
    having count(*)<4
    order by id
    go
    
    Table: #a___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #b___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #c___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #d___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: Worktable2  scan count 1, logical reads: (regular=1339 apf=0 total=1339), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: Worktable1  scan count 1, logical reads: (regular=2263 apf=0 total=2263), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Total writes for this command: 0 
     
    Execution Time 0. 
    SQL Server cpu time: 0 ms.  SQL Server elapsed time: 16 ms.
    
    -- And the non standard query
    select id, tid
    from 
    (select id,'A' from #a union all
     select id,'B' from #b union all
     select id,'C' from #c union all
     select id,'D' from #d 
    ) t(id,tid)
    group by id
    having count(*)<4
    order by id
    go
    
    Table: #a___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #b___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #c___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #d___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #a___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #b___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #c___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: #d___________00002180016498795 scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: Worktable3  scan count 2227, logical reads: (regular=5726 apf=0 total=5726), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: Worktable1  scan count 1, logical reads: (regular=2257 apf=0 total=2257), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: Worktable2  scan count 1, logical reads: (regular=2257 apf=0 total=2257), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Table: Worktable4  scan count 0, logical reads: (regular=466 apf=0 total=466), physical reads: (regular=0 apf=0 total=0), apf IOs used=0 
    Total writes for this command: 0 
     
    Execution Time 1. 
    SQL Server cpu time: 100 ms.  SQL Server elapsed time: 36 ms.
    
    
    drop table #a
    drop table #b
    drop table #c
    drop table #d

  10. #10
    Join Date
    Apr 2003
    Posts
    64
    It turns out, there is not really a "master" table, that could be an authoritative source of the IDs. We really need to catch any and all, that are listed in some of the tables, but not in all -- a left join, AFAIU, will treat the first (left) table as the "authoritative". The creation and indexing of the temporary tables also makes things quite complicated and a bit harder to maintain, in case we need to add another table to the mix in the future.

    A standard SQL equivalent of what I posted would be greatly appreciated! Thank you very much for the examples and the analysis. Yours,

    -mi
    If you ever back up Sybase, you want this backup-server plugin.

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by teterin
    A standard SQL equivalent of what I posted would be greatly appreciated!
    you un-standardized my solution by adding Source to the SELECT but not the GROUP BY

    the purpose of my GROUP BY was to give you exactly what you asked for -- the numbers which were not present in all 4

    you seem to want also to know which sources they were in, and that's a different query

    note that you were getting only one of the sources, which could have been any of one, two, or three sources

    that's a mysql feature, by the way -- not a bug -- and if anyone is interested, i'll post a really good link on the subject
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Apr 2003
    Posts
    64
    Quote Originally Posted by r937
    note that you were getting only one of the sources, which could have been any of one, two, or three sources
    Uhm, I'm pretty sure, we get the list of all tables, where an ID is listed, if it is not listed in some others.

    Quote Originally Posted by r937
    that's a mysql feature, by the way -- not a bug -- and if anyone is interested, i'll post a really good link on the subject
    Maybe, we are talking about different things here? We are using Sybase, not MySQL...
    If you ever back up Sybase, you want this backup-server plugin.

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    well, whaddya know, i thought mysql was the only database that runs invalid syntax, and i guess i forgot which forum we're in

    so are you saying that sybase actually runs the query exactly as you have it there in post #7????
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by teterin
    A standard SQL equivalent of what I posted would be greatly appreciated!
    Code:
    select count(TradeNumber) Lines
         , TradeNumber
         , sum(Source) 'Present In:'
      from (
           select M_TRADENUMB TradeNumber, 2**0 as Source
             from A1
           union all select M_TRADENUMB, 2**1 
             from A2
           ...
           union all select M_TRADENUMB, 2**n 
             from An
           ) as MEOW
    group 
        by TradeNumber
    having count(TradeNumber) <> n 
    order 
        by TradeNumber
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  15. #15
    Join Date
    Apr 2003
    Posts
    64
    Well, here is the full actual query (only the table-names and the column-names altered to protect privacy somewhat):

    Code:
    select count(TradeNumber) Lines, TradeNumber, Source 'Present In:', "FAILED" Verdict
    from (
    	select distinct TRADENUMB TradeNumber, "R" Source
    		from R
    		where Z_DAYOFRUN = 'FIRST RUN'
    		having MAX(REF_JOB)=REF_JOB
    union all
    	select distinct TRADENUMB TradeNumber, "P" Source
    		from P
    		where Z_DAYOFRUN = 'FIRST RUN'
    		having MAX(REF_JOB)=REF_JOB
    union all
    	select distinct TRADENUMB TradeNumber, "C" Source
    		from C
    		where Z_DAYOFRUN = 'FIRST RUN'
    		having MAX(REF_JOB)=REF_JOB
    union all
    	select TRADENUMB TradeNumber, "TRADE" Source
    		from TRADE
    		where Z_DAYOFRUN = 'FIRST RUN' AND Z_STATUS <> 'DEAD'
    		having REF_JOB=MAX(REF_JOB) and Z_STATUS <> 'DEAD'
    union all
    	select NB TradeNumber, "PNL" Source
    		from PNL
    		where STATUS2 <> 'DEAD' AND PFOLIO = 'Woof'
    ) as MEOW
    	group by TradeNumber
    	having count(TradeNumber) <> 5 order by TradeNumber
    The first three queries are identical -- just on different tables, but the fourth and the fifth are different. The fifth even works on differently named columns, but they all are supposed to produce the exact same sets of IDs, and if they don't, we want to know about it...

    Now, I'm quite certain, this could be embelished and brought to the level of standard SQL -- could anyone, please, help? Thank you, all!
    If you ever back up Sybase, you want this backup-server plugin.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •