Results 1 to 9 of 9
  1. #1
    Join Date
    Jan 2013
    Posts
    11

    Unanswered: DB2 GROUP BY problem

    Hi,

    Im trying to perform a group by action on this query:

    select
    task.task_id, task.subject,
    varchar_format(task.due_date + {3} SECOND,''MM/DD/YYYY''),
    forUsr.full_name,forUsrGrp.display_name as group_name,
    priority.name,
    coalesce(
    (select 1
    from lsw_usr_grp_mem_xref ugm
    where
    ugm.group_id = forUsrGrp.group_id
    and ugm.user_id in ({0})
    FETCH FIRST 1 ROWS ONLY),
    coalesce((select 1 from sysibm.sysdummy1 where task.user_id in ({0})), 0)) as runnable,
    bd.bpd_instance_id, fromUsrGrp.display_name as group_name,
    bd.alias as Variable, bd.string_value as Valor, bi.instance_name
    from
    lsw_task task left outer join lsw_usr_xref forUsr on task.user_id = forUsr.user_id
    left outer join lsw_usr_grp_xref forUsrGrp on task.group_id = forUsrGrp.group_id
    left outer join lsw_usr_grp_xref fromUsrGrp on task.group_id = -fromUsrGrp.group_id
    left outer join lsw_process process on task.cached_process_version_id = process.version_id
    left outer join lsw_priority priority on task.priority_id = priority.priority_id
    left outer join lsw_bpd_instance_variables bd on bd.bpd_instance_id = task.bpd_instance_id
    left outer join lsw_bpd_instance bi on bd.bpd_instance_id = bi.bpd_instance_id
    where
    task.task_id in ({1})

    the problem is "runnable" field i can't figure how to group it, if I remove

    coalesce(
    (select 1
    from lsw_usr_grp_mem_xref ugm
    where
    ugm.group_id = forUsrGrp.group_id
    and ugm.user_id in ({0})
    FETCH FIRST 1 ROWS ONLY),
    coalesce((select 1 from sysibm.sysdummy1 where task.user_id in ({0})), 0)) as runnable,

    Im able to group the rest, how can I group "runnable" field?

    PD: Sorry for bad English.

  2. #2
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    An easy way might be put your query in a subquery and group by in outer select, like...
    Code:
    SELECT runnable
         , /* some aggregate functions */
     FROM  (/* put your query here */) AS s
     GROUP BY
           runnable

  3. #3
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Ok, seems like you did not provide us with the entire SQL, so a little difficult, but I think you are right with your assumption that the problem lies in this coalesce function where you are running the two queries. Why have those in your select statement? You aren't selecting anything from either of the statements, so it seems that you could perform an AND/OR condition with EXISTS in your where clause, this would then remove your grouping problem. plus the FETCH FIRST 1 ROW ONLY does not help your performance the way an EXISTS would.

    Something like:
    Code:
    select whatever_columns, sum(some_function)
    where (exists (select 1
    from lsw_usr_grp_mem_xref ugm
    where
    ugm.group_id = forUsrGrp.group_id
    and ugm.user_id in ({0})
    OR (exists (select 1 from sysibm.sysdummy1 where task.user_id in ({0})), 0)) as runnable,
    bd.bpd_instance_id, fromUsrGrp.display_name as group_name,
    bd.alias as Variable, bd.string_value as Valor, bi.instance_name
    from
    lsw_task task left outer join lsw_usr_xref forUsr on task.user_id = forUsr.user_id
    left outer join lsw_usr_grp_xref forUsrGrp on task.group_id = forUsrGrp.group_id
    left outer join lsw_usr_grp_xref fromUsrGrp on task.group_id = -fromUsrGrp.group_id
    left outer join lsw_process process on task.cached_process_version_id = process.version_id
    left outer join lsw_priority priority on task.priority_id = priority.priority_id
    left outer join lsw_bpd_instance_variables bd on bd.bpd_instance_id = task.bpd_instance_id
    left outer join lsw_bpd_instance bi on bd.bpd_instance_id = bi.bpd_instance_id
    where
    task.task_id in ({1})))
    and the_rest_of_the_SQL_that_you_did_not_providegroup by whatever_columns

  4. #4
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I thought that runnable might be replaced by the following expression.

    Code:
         , CASE
           WHEN EXISTS
                (SELECT 0
                  FROM  lsw_usr_grp_mem_xref ugm
                  WHERE ugm.group_id = forUsrGrp.group_id
                    AND ugm.user_id IN ({0})
                )
            OR  task.user_id IN ({0}) THEN
                1
           ELSE 0
           END  AS runnable

  5. #5
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I guessed that it might be easy to construct the query by the following way
    (and might perform well than original query).

    (1) in a inner most subquery, join task table and forUsrGrp table
    and select necessary columns from the two tables and calculate runnable.
    (2) perform group by in second level subquery.
    (3) join other tables in outmost select.


    How do you want group by(as dav1mo asked)?
    What final result do you want?
    Last edited by tonkuma; 01-02-14 at 15:23. Reason: Replace fromUsrGrp by forUsrGrp

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Quote Originally Posted by tonkuma View Post
    I guessed that it might be easy to construct the query by the following way
    (and might perform well than original query).

    (1) in a inner most subquery, join task table and forUsrGrp table
    and select necessary columns from the two tables and calculate runnable.
    (2) perform group by in second level subquery.
    (3) join other tables in outmost select.

    ...
    Sorry, I jumped to the conclusion too easy.
    I assumed that grouping were based on runnable(and maybe additional some columns of task table and forUsrGrp table).
    But, the assumption was not clear.

    So, the questions are essential.
    How do you want group by(as dav1mo asked)?
    What final result do you want?
    Last edited by tonkuma; 01-02-14 at 15:25. Reason: Replace fromUsrGrp by forUsrGrp

  7. #7
    Join Date
    Jan 2013
    Posts
    11
    As you can see in the next image Im getting duplicate rows because of the "bd.alias" and "bd.string_value" fields http://imageshack.com/a/img838/5719/1wa.png

    So I need to group the result by
    task.task_id, task.subject, task.due_date, forUsr.full_name,forUsrGrp.display_name,
    "runnable", bd.bpd_instance_id, fromUsrGrp.display_name
    In that way I'll be able to concatenate the multiple values from the bd.alias and bd.string_values fields in a single field.

    here is an example of what i want:

    select
    task.task_id, task.subject,varchar_format(task.due_date + {3} SECOND,''MM/DD/YYYY''),
    forUsr.full_name,forUsrGrp.display_name as group_name, priority.name,
    coalesce((select 1 from lsw_usr_grp_mem_xref ugm
    where ugm.group_id = forUsrGrp.group_id and ugm.user_id in ({0})
    FETCH FIRST 1 ROWS ONLY),
    coalesce((select 1 from sysibm.sysdummy1
    where task.user_id in ({0})), 0)) as runnable,
    bd.bpd_instance_id, fromUsrGrp.display_name as group_name, bi.instance_name,
    SUBSTR(xmlserialize(xmlagg(xmltext(CONCAT( '|',CASE bd.VARIABLE_TYPE
    WHEN 'String' THEN 'var: ' || bd.Alias || ',valor: ' || bd.String_value
    WHEN 'Boolean' THEN 'var: ' || bd.Alias || ',valor: ' || bd.Boolean_value
    WHEN 'Decimal' THEN 'var: ' || bd.Alias || ',valor: ' || bd.Dec_value
    WHEN 'Date' THEN 'var: ' || bd.Alias || ',valor: ' || bd.Date_value
    END))) as VARCHAR(1024)), 2)
    from
    lsw_task task
    left outer join lsw_usr_xref forUsr on task.user_id = forUsr.user_id
    left outer join lsw_usr_grp_xref forUsrGrp on task.group_id = forUsrGrp.group_id
    left outer join lsw_usr_grp_xref fromUsrGrp on task.group_id = -fromUsrGrp.group_id
    left outer join lsw_process process on task.cached_process_version_id = process.version_id
    left outer join lsw_priority priority on task.priority_id = priority.priority_id
    left outer join lsw_bpd_instance_variables bd on bd.bpd_instance_id = task.bpd_instance_id
    left outer join lsw_bpd_instance bi on bd.bpd_instance_id = bi.bpd_instance_id
    where
    task.task_id in ({1})
    group by task.task_id, task.subject, task.due_date,
    forUsr.full_name,forUsrGrp.display_name, priority.name,
    runnable, bd.bpd_instance_id, fromUsrGrp.display_name,
    bi.instance_name

    This query is part of the functionality of ibm bpm software v7.5 so, I can add more fields but i can't remove them because some functionality may get lost.

    PD: Again sorry for bad english

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    (1) Are bi.instance_name not duplicated?
    Looking your image, it might be not duplicated. Right?

    (2)
    left outer join lsw_bpd_instance_variables bd on bd.bpd_instance_id = task.bpd_instance_id
    left outer join lsw_bpd_instance bi on bd.bpd_instance_id = bi.bpd_instance_id
    This could be re-written like
    left outer join lsw_bpd_instance_variables bd on bd.bpd_instance_id = task.bpd_instance_id
    left outer join lsw_bpd_instance bi on task.bpd_instance_id = bi.bpd_instance_id
    It means bi table could be joined directly to task table(not through bd table).

  9. #9
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    If you are using DB2 9.7.x(I forgot which fixpack), you may want to use LISTAGG function.

    Considering (1) and (2) in my just before post and my another post,
    this query might be worth to try.
    Code:
    SELECT
           task.task_id
         , task.subject
         , VARCHAR_FORMAT(task.due_date + {3} SECOND , ''MM/DD/YYYY'') /* AS task_due_date */
         , forUsr.full_name
         , forUsrGrp.display_name as group_name
         , priority.name
         , CASE
           WHEN EXISTS
                (SELECT 0
                  FROM  lsw_usr_grp_mem_xref ugm
                  WHERE ugm.group_id = forUsrGrp.group_id
                    AND ugm.user_id in ({0})
                )
            OR  task.user_id in ({0}) THEN
                1
           ELSE 0
           END  AS runnable
         , bd.bpd_instance_id
         , fromUsrGrp.display_name AS group_name
         , (SELECT LISTAGG(
                      'var: ' || bd.Alias || ',valor: '
                   || CASE bd.variable_type
                      WHEN 'String'  THEN bd.String_value
                      WHEN 'Boolean' THEN bd.Boolean_value
                      WHEN 'Decimal' THEN bd.Dec_value
                      WHEN 'Date'    THEN bd.Date_value
                      END
                    , '|'
                   )
             FROM  lsw_bpd_instance_variables bd
             WHERE bd.bpd_instance_id = task.bpd_instance_id
           ) AS var_valor
         , bi.instance_name
     FROM
           lsw_task         task
     LEFT  OUTER JOIN
           lsw_usr_xref     forUsr
      ON   forUsr.user_id = task.user_id
     LEFT  OUTER JOIN
           lsw_usr_grp_xref forUsrGrp
      ON   forUsrGrp.group_id = task.group_id
     LEFT  OUTER JOIN
           lsw_usr_grp_xref fromUsrGrp
      ON   fromUsrGrp.group_id = -task.group_id
     LEFT  OUTER JOIN
           lsw_process      process
      ON   process.version_id = task.cached_process_version_id
     LEFT  OUTER JOIN
           lsw_priority     priority
      ON   priority.priority_id = task.priority_id
     LEFT  OUTER JOIN
           lsw_bpd_instance bi
      ON   bi.bpd_instance_id = task.bpd_instance_id
     WHERE
           task.task_id in ({1})
    Note1: I prefer to write column names in ON clause by this order.
    Code:
           lsw_task         task
     LEFT  OUTER JOIN
           lsw_usr_xref     forUsr
      ON   forUsr.user_id = task.user_id
    Note2: It might be better to change duplicated aliases group_name.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •