Results 1 to 11 of 11
  1. #1
    Join Date
    Jan 2008
    Posts
    45

    Unanswered: IN operator and =ANY

    can i use =ANY instead of IN for a subquery like


    select empid from emp where empid IN(select empis from dept);


    If so what is the difference between =ANY and IN??

  2. #2
    Join Date
    Jan 2003
    Posts
    4,292
    Provided Answers: 5
    They are basically the same.


    select empid from emp where empid IN(select empis from dept);

    is the same as

    select empid from emp where empid = ANY (select empis from dept);

    Andy

  3. #3
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Actually, there is a HUGE difference. Using the IN (non-correlated subquery) is indexable and a stage 1 predicate, whereas, using the = ANY (non-correlated subquery) is not indexable and is a stage 2 predicate. So the difference comes in with the CPU cost of executing the query and if the column being equated to exists in an index. Good luck.
    Dave Nance

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    there is a HUGE difference
    To be fair, that probably depends on the DB2 version, platform, the actual query plan, etc. In all cases I have seen both IN and =ANY predicates transform into joins, so I would hold my judgment until I have a chance to compare execution plans.
    ---
    "It does not work" is not a valid problem statement.

  5. #5
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Yes, they will both be transformed into joins, but the IN is indexable(if supporting index exists) and a stage 1(sargable) predicate, so it will always be the more efficient in terms of CPU cost. That is until a newer version is released that makes the = ANY into an indexable, stage 1 predicate, since with every release we get more indexable stage 1 predicates.

  6. #6
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Like I said, I have yet to see a difference in the execution plans with either of these predicates. Granted, I don't work much on z/OS so my experience is mostly with the "distributed systems", but whatever experience I have tells me that IN and =ANY seem to be quite equivalent.
    ---
    "It does not work" is not a valid problem statement.

  7. #7
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Quote Originally Posted by dav1mo
    Yes, they will both be transformed into joins, but the IN is indexable(if supporting index exists) and a stage 1(sargable) predicate, so it will always be the more efficient in terms of CPU cost. That is until a newer version is released that makes the = ANY into an indexable, stage 1 predicate, since with every release we get more indexable stage 1 predicates.
    It is not only that we get more stage 1 predicates, it is also that the query rewrite function is being improved so that DB2 rewrites the query before the access path is chosen so that two queries that look different may end up being the same.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  8. #8
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Making absolute statements what the DB2 optimizer will or will not do in general for a certain construct are usually always wrong. Here is an example showing that n_i is right and dav1mo is wrong:
    Code:
    $ db2 "create table t1 ( a int, b int )"
    $ db2 "create table t2 ( a int, b int )"
    $ db2 "create index i2 on t2(a)"
    $ db2 "begin atomic declare i int default 0; while ( i < 100000 ) do insert into t1 values (i, i * 10); set i = i + 1; end while; end"
    $ db2 "begin atomic declare i int default 0; while ( i < 1000000 ) do insert into t2 values (i, rand() * 1000000 ); set i = i + 1; end while; end"
    $ db2 runstats on table stolze.t1
    $ db2 runstats on table stolze.t2
    $ db2expln -d test -q "select * from t1 where b =any ( select a from t2 )" -t -g -i  
    $ db2expln -d test -q "select * from t1 where b in ( select a from t2 )" -t -g -i
    Both queries show the exact same access plan and both have the same costs.
    Code:
    Optimizer Plan:
    
           Rows
         Operator
           (ID)
           Cost
    
         100000
         RETURN
          ( 1)
         736.227
           |
         100000
         NLJOIN
          ( 2)
         736.227
        /       \-\
     100000        *
     TBSCAN       |
      ( 3)    1.84467e+19
     543.787  Index:
       |      STOLZE
     100000   I2
     Table:
     STOLZE
     T1
    Granted, this is DB2 LUW and not DB2 z/OS. But the OP didn't say anything about a specific DB2 version or platform, which makes absolute statements even more dubious. And even for DB2 z/OS I have serious doubts. (I'll double-check with the DB2 z/OS optimizer folks to get a reliable answer on that.)
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  9. #9
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Quote Originally Posted by n_i
    ... whatever experience I have tells me that IN and =ANY seem to be quite equivalent.
    Not just that: they are equivalent.
    So, any optimizer (DB2 or not, z/OS or not, version 9 or not) should do a query rewrite from "= ANY" to "IN" (possibly followed by a query rewrite to a join, but that decision will be based on available stats for the inner query, while the first rewrite should just be a blind, rule-based decision).
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  10. #10
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    I was mistaken yesterday for LUW, tried it out last night, plus stolze tried it as well. On Z/OS I get index access matching 1 col for the IN and a non-matching index scan for the =ANY. From reading through the manuals yesterday it appeared the result should have been comparable on LUW. Sorry for misleading you there.

    Dave Nance

  11. #11
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    I got word from the DB2 z/OS developers: =ANY is converted to an IN predicate. So both are equivalent.

    If you have any example that shows a different behavior, please let me know and I can forward this to the development team for further investigation.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •