Results 1 to 5 of 5
  1. #1
    Join Date
    Apr 2009
    Posts
    3

    Unanswered: After combining tables, how to avoid sub queries?

    I start out with 2 mysql tables:

    create table thingA (dt date, val int);
    create table thingB (dt date, val int);

    populate them:

    insert into thingA values('2009-01-01',1);
    insert into thingA values('2009-01-02',2);
    insert into thingA values('2009-01-03',3);
    insert into thingB values('2009-01-01',40);
    insert into thingB values('2009-01-03',50);
    insert into thingB values('2009-01-04',60);

    Run my query:

    select
    thingA.dt,
    thingA.val as Val1,
    thingB.val as Val2
    from
    thingA
    left join thingB
    on thingA.dt=thingB.dt
    union
    select
    thingB.dt,
    thingA.val as Val1,
    thingB.val as Val2
    from
    thingB
    left join thingA
    on thingA.dt=thingB.dt
    order by dt;

    gives me the desired result:

    +------------+------+------+
    | dt | Val1 | Val2 |
    +------------+------+------+
    | 2009-01-01 | 1 | 40 |
    | 2009-01-02 | 2 | NULL |
    | 2009-01-03 | 3 | 50 |
    | 2009-01-04 | NULL | 60 |
    +------------+------+------+

    Now, I'm going to have a lot more 'things' and don't wantt a table for each, so I'm thinking of a single table instead:

    create table things (dt date, val int, thing varchar(10));

    insert into things values('2009-01-01',1,"A");
    insert into things values('2009-01-02',2,"A");
    insert into things values('2009-01-03',3,"A");
    insert into things values('2009-01-01',40,"B");
    insert into things values('2009-01-03',50,"B");
    insert into things values('2009-01-04',60,"B");

    and there will be data for other 'things' like:

    insert into things values('2009-01-01',100,"C");
    etc.

    I'm trying to rewrite my query to get the same result as above. The best I can do is:

    select
    distinct(dt) ,
    (select val from things where thing="A" and dt=t1.dt) as Val1 ,
    (select val from things where thing="B" and dt=t1.dt) as Val2
    from
    things as t1
    where
    thing in ("A","B")
    order by dt;

    Which gives the desired result:

    +------------+------+------+
    | dt | Val1 | Val2 |
    +------------+------+------+
    | 2009-01-01 | 1 | 40 |
    | 2009-01-02 | 2 | NULL |
    | 2009-01-03 | 3 | 50 |
    | 2009-01-04 | NULL | 60 |
    +------------+------+------+

    My concern is that this table is going to get pretty big (lots of 'things' and and thousands of dates for each thing), so I'm worried that the sub queries are really going to slow things down (even if I index dt & thing).

    How can I rewrite the query to avoid sub queries?

    Thanks for any help.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    compare this to your original query --
    Code:
    SELECT thingA.dt
         , thingA.val AS Val1
         , thingB.val AS Val2
      FROM things AS thingA
    LEFT OUTER
      JOIN things AS thingB
        ON thingB.dt = thingA.dt
       AND thingB.thing = 'B'
     WHERE thingA.thing = 'A'
    UNION
    SELECT thingB.dt
         , thingA.val AS Val1
         , thingB.val AS Val2
      FROM things AS thingB
    LEFT OUTER
      JOIN things AS thingA
        ON thingA.dt = thingB.dt
       AND thingA.thing = 'A'
     WHERE thingB.thing = 'B'
    ORDER 
        BY dt;
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Apr 2009
    Posts
    3
    Fantasticly quick reply.
    I'm in awe of your solution.
    Huge thanks.

  4. #4
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    ajvok, You indicate that you are using mysql. I have never used it or read any documentation so I don't know if you can use this or not. The only reason I mention it is you indicated you are starting with 2 things (A, B), you plan on a third (C) and say there will be 'lots' of things (10, 100, 1000?).

    The problem with the solution is you will have to UNION a "SELECT LEFT OUTER JOIN" for every thing you add to the table. Here is a possible solution as long as the syntax is valid in mysql. It works in DB2:

    Code:
    SELECT DT
    , MAX(CASE WHEN thing = 'A' THEN val ELSE NULL END) AS val1
    , MAX(CASE WHEN thing = 'B' THEN val ELSE NULL END) AS val2
    , MAX(CASE WHEN thing = 'C' THEN val ELSE NULL END) AS val3
    FROM THINGS
    GROUP BY DT
    ;
    As you add new a new 'thing', you just have to add 1 new 'MAX(CASE)' line for each one.
    Code:
    Table: THINGS
    Col1: dt DATE
    Col2: val INTEGER
    Col3: thing VARCHAR(10)
    Code:
    Data:
    DT         VAL         THING     
    ---------- ----------- ----------
    01/01/2009           1 A         
    01/02/2009           2 A         
    01/03/2009           3 A         
    01/03/2009          40 B         
    01/04/2009          50 B         
    01/01/2009          60 B         
    01/01/2009         700 C         
    01/04/2009         800 C         
    01/05/2009         900 C
    Code:
    Result of query:
    DT         VALA        VALB        VALC       
    ---------- ----------- ----------- -----------
    01/01/2009           1          60         700
    01/02/2009           2           -           -
    01/03/2009           3          40           -
    01/04/2009           -          50         800
    01/05/2009           -           -         900

  5. #5
    Join Date
    Apr 2009
    Posts
    3
    That does work in Mysql.
    Less code, and appears to be quicker.
    Very nice, thank you.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •