Results 1 to 5 of 5
  1. #1
    Join Date
    Jul 2010
    Posts
    2

    Unanswered: Count system total # of docs

    Hi,

    I have an application that stores documents. It stores the index data and links to the documents in DB2 tables.
    I have now received the task to find our how many documents are in the system. As well as the expected growth per month.

    It creates multiple tables for documents (for faster searches, it uses a "max rows per table" function and tables are scoped by date). Here is the end of "list tables":

    ...
    WHA1 ROOT T 2008-06-17-09.33.33.258557
    XAA10 ROOT T 2004-11-08-13.58.00.225807
    XAA11 ROOT T 2005-03-08-12.10.39.250286
    XAA12 ROOT T 2005-08-08-13.35.44.515158
    XAA2 ROOT T 2000-07-14-17.58.03.662497
    XAA3 ROOT T 2001-08-10-14.10.51.876224
    XAA4 ROOT T 2002-07-10-12.15.12.900046
    XAA5 ROOT T 2003-01-10-14.16.05.382125
    XAA6 ROOT T 2003-06-10-18.42.33.149292
    XAA7 ROOT T 2003-10-07-19.23.47.670011
    XAA8 ROOT T 2004-02-10-06.29.00.232303
    XAA9 ROOT T 2004-06-08-15.41.04.559518
    XEA1 ROOT T 2001-10-02-12.21.25.050381
    XGA1 ROOT T 2006-02-17-10.13.05.795665
    XGA2 ROOT T 2006-02-24-18.19.08.562070
    YDA1 ROOT T 2001-06-12-14.34.35.198750
    YDA2 ROOT T 2004-06-15-04.30.15.024587
    YEA1 ROOT T 2001-10-04-15.00.54.048835

    207 record(s) selected.

    Some 175-190 tables actually contain document records - these table names all end with one or more digits.

    Now to get the total number of docs I could run "select count(*) from YEA1", "select count(*) from YDA2" etc some 190 times and add the results, which is described in an even more obnoxious way here:

    http://www.dbforums.com/db2/1645326-...ee-tables.html

    but I want to know if there is some less "codey" way to do this.
    I even tried to run a cursor but it has been so long since I ran much SQL so I failed at it.

    What I need is something like this metaphrase:

    SELECT myresult.tblnm, COUNT(*) FROM (select (<first column of> as tblnm LIST TABLES > myresult) WHERE <"Table/View" ends with at least one digit>)

    I hope you understand what I mean.
    (And no, there is NOTHING I can do to change the number of tables. It is
    IBMs creation. )

  2. #2
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    you could run a query against the catalog that would create the SQL you need.

    Something along the lines of

    select 'select count(*) from ' || rtrim(creator) || '.' || rtrim(tbname) || ';'
    from columns
    where column = your_document_column

    You would, also need to wrap that up with a select sum(of the counts) with all of those queries UNIONed together. It would essentially be your 190 or so counts being done, with a query wrapped around that gives you the sum of all of those counts in one query.
    Dave

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Here is an example to produce a query from list_tables:

    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
    /************************************************************
    ********** Start of sample data                    **********
    ************************************************************/
    list_tables
    (first_col , second_col , third_col , fourth_col) AS (
    VALUES
      ('WHA1'  , 'ROOT' , 'T' , '2008-06-17-09.33.33.258557')
    , ('XAA10' , 'ROOT' , 'T' , '2004-11-08-13.58.00.225807')
    , ('XAA11' , 'ROOT' , 'T' , '2005-03-08-12.10.39.250286')
    , ('XAA12' , 'ROOT' , 'T' , '2005-08-08-13.35.44.515158')
    , ('XA12A' , 'ROOT' , 'T' , '2005-08-08-13.35.44.515158')
    , ('XAA2'  , 'ROOT' , 'T' , '2000-07-14-17.58.03.662497')
    , ('XAA3'  , 'ROOT' , 'T' , '2001-08-10-14.10.51.876224')
    , ('XAA4'  , 'ROOT' , 'T' , '2002-07-10-12.15.12.900046')
    , ('XAA5'  , 'ROOT' , 'T' , '2003-01-10-14.16.05.382125')
    , ('XAA6'  , 'ROOT' , 'T' , '2003-06-10-18.42.33.149292')
    , ('XAA4A' , 'ROOT' , 'T' , '2002-07-10-12.15.12.900046')
    , ('XAA5B' , 'ROOT' , 'T' , '2003-01-10-14.16.05.382125')
    , ('XAA6C' , 'ROOT' , 'T' , '2003-06-10-18.42.33.149292')
    , ('XAA7'  , 'ROOT' , 'T' , '2003-10-07-19.23.47.670011')
    , ('XAA8'  , 'ROOT' , 'T' , '2004-02-10-06.29.00.232303')
    , ('XAA9'  , 'ROOT' , 'T' , '2004-06-08-15.41.04.559518')
    , ('XEA1'  , 'ROOT' , 'T' , '2001-10-02-12.21.25.050381')
    , ('XEAX'  , 'ROOT' , 'T' , '2001-10-02-12.21.25.050381')
    , ('XGA1'  , 'ROOT' , 'T' , '2006-02-17-10.13.05.795665')
    , ('XGAY'  , 'ROOT' , 'T' , '2006-02-17-10.13.05.795665')
    , ('XGA2'  , 'ROOT' , 'T' , '2006-02-24-18.19.08.562070')
    , ('XGAZ'  , 'ROOT' , 'T' , '2006-02-24-18.19.08.562070')
    , ('YDA1'  , 'ROOT' , 'T' , '2001-06-12-14.34.35.198750')
    , ('YDA2'  , 'ROOT' , 'T' , '2004-06-15-04.30.15.024587')
    , ('YEA1'  , 'ROOT' , 'T' , '2001-10-04-15.00.54.048835')
    )
    /************************************************************
    **********   End of sample data                    **********
    ************************************************************/
    SELECT
           XMLCAST(
             XMLGROUP(
               line_text || x'0d0a' AS s
               ORDER BY n1 , n2
             )
             AS CLOB(40K)
           ) AS query_produced
      FROM (VALUES
              (1 , 0010 , 'SELECT CASE '                                   )
            , (1 , 0020 , '       WHEN GROUPING(table_name) = 1 THEN '     )
            , (1 , 0030 , '            ''*** All tables ***'' '            )
            , (1 , 0040 , '       ELSE table_name '                        )
            , (1 , 0050 , '       END  AS table_name '                     )
            , (1 , 0060 , '     , SUM(number_of_docs) AS number_of_docs '  )
            , (1 , 0070 , '  FROM ( '                                      )
            , (9 , 0010 , '       ) s (table_name , number_of_docs) '      )
            , (9 , 0020 , ' GROUP BY '                                     )
            , (9 , 0030 , '       ROLLUP(table_name) '                     )
            , (9 , 0040 , ' ORDER BY '                                     )
            , (9 , 0050 , '       NULLIF(table_name , ''*** All tables ***'') ')
            , (9 , 0070 , ';'                                              )
            UNION ALL
            SELECT 5 , k * 10 + m
                 , CASE m
                   WHEN 1 THEN
                        CASE k
                        WHEN 1 THEN
                             '        '
                        ELSE '        UNION ALL '
                        END
                   WHEN 2 THEN
                        '        SELECT ''' || first_col || ''' '
                   WHEN 3 THEN
                        '             , COUNT(*) '
                   WHEN 4 THEN
                        '          FROM ' || first_col
                   END
             FROM  (SELECT first_col
                         , ROW_NUMBER() OVER(ORDER BY first_col) AS k
                      FROM list_tables
                     WHERE RIGHT(first_col , 1)
                           IN ('0' , '1' , '2' , '3' , '4' , '5' , '6' , '7' , '8' , '9')
                   ) q
             CROSS JOIN
                   (VALUES 1 , 2 , 3 , 4) m(m)
           ) r(n1 , n2 , line_text)
     WHERE line_text <> ''
    ;
    ------------------------------------------------------------------------------
    
    QUERY_PRODUCED                                                                
    ------------------------------------------------------------------------------
    SELECT CASE 
           WHEN GROUPING(table_name) = 1 THEN 
                '*** All tables ***' 
           ELSE table_name 
           END  AS table_name 
         , SUM(number_of_docs) AS number_of_docs 
      FROM ( 
            SELECT 'WHA1' 
                 , COUNT(*) 
              FROM WHA1
            UNION ALL 
            SELECT 'XAA10' 
                 , COUNT(*) 
              FROM XAA10
    ...
    ...
            UNION ALL 
            SELECT 'YEA1' 
                 , COUNT(*) 
              FROM YEA1
           ) s (table_name , number_of_docs) 
     GROUP BY 
           ROLLUP(table_name) 
     ORDER BY 
           NULLIF(table_name , '*** All tables ***') 
    ;
    
    
      1 record(s) selected.

    Result of execution of the produced query would look like this:
    Code:
    TABLE_NAME         NUMBER_OF_DOCS
    ------------------ --------------
    WHA1                        xxxxx
    XAA10                       xxxxx
    ...
    
    YEA1                        xxxxx
    *** All tables ***        xxxxxxx
    
    
      nnn record(s) selected.
    Last edited by tonkuma; 07-20-10 at 02:02. Reason: Removed table_name from ORDER BY clause in the generated query.

  4. #4
    Join Date
    Jan 2010
    Posts
    335
    Hi,

    if you often update your statistics you could just add the Cards for the tables from syscat.tables. It's less accurate, but much faster.

  5. #5
    Join Date
    Jul 2010
    Posts
    2
    Hi all and thanks for your tips!

    Shortly after I posted I found the

    select tabname, card from syscat.table

    query which is accurate enough for my needs.
    (We run update statisctics at least once a week, I think maybe even nightly.)

    Importing it to Excel made the adding easy for me too!
    *is happy puppy*

    Cheers!

    (FYI, the # of docs was around 150 million)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •