If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Count system total # of docs

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-19-10, 11:24
Bezman Bezman is offline
Registered User
 
Join Date: Jul 2010
Posts: 2
Count system total # of docs

Hi,

I have an application that stores documents. It stores the index data and links to the documents in DB2 tables.
I have now received the task to find our how many documents are in the system. As well as the expected growth per month.

It creates multiple tables for documents (for faster searches, it uses a "max rows per table" function and tables are scoped by date). Here is the end of "list tables":

...
WHA1 ROOT T 2008-06-17-09.33.33.258557
XAA10 ROOT T 2004-11-08-13.58.00.225807
XAA11 ROOT T 2005-03-08-12.10.39.250286
XAA12 ROOT T 2005-08-08-13.35.44.515158
XAA2 ROOT T 2000-07-14-17.58.03.662497
XAA3 ROOT T 2001-08-10-14.10.51.876224
XAA4 ROOT T 2002-07-10-12.15.12.900046
XAA5 ROOT T 2003-01-10-14.16.05.382125
XAA6 ROOT T 2003-06-10-18.42.33.149292
XAA7 ROOT T 2003-10-07-19.23.47.670011
XAA8 ROOT T 2004-02-10-06.29.00.232303
XAA9 ROOT T 2004-06-08-15.41.04.559518
XEA1 ROOT T 2001-10-02-12.21.25.050381
XGA1 ROOT T 2006-02-17-10.13.05.795665
XGA2 ROOT T 2006-02-24-18.19.08.562070
YDA1 ROOT T 2001-06-12-14.34.35.198750
YDA2 ROOT T 2004-06-15-04.30.15.024587
YEA1 ROOT T 2001-10-04-15.00.54.048835

207 record(s) selected.

Some 175-190 tables actually contain document records - these table names all end with one or more digits.

Now to get the total number of docs I could run "select count(*) from YEA1", "select count(*) from YDA2" etc some 190 times and add the results, which is described in an even more obnoxious way here:

Db2 Query to find the sum of record count of three tables

but I want to know if there is some less "codey" way to do this.
I even tried to run a cursor but it has been so long since I ran much SQL so I failed at it.

What I need is something like this metaphrase:

SELECT myresult.tblnm, COUNT(*) FROM (select (<first column of> as tblnm LIST TABLES > myresult) WHERE <"Table/View" ends with at least one digit>)

I hope you understand what I mean.
(And no, there is NOTHING I can do to change the number of tables. It is
IBMs creation. )
Reply With Quote
  #2 (permalink)  
Old 07-19-10, 12:23
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
you could run a query against the catalog that would create the SQL you need.

Something along the lines of

select 'select count(*) from ' || rtrim(creator) || '.' || rtrim(tbname) || ';'
from columns
where column = your_document_column

You would, also need to wrap that up with a select sum(of the counts) with all of those queries UNIONed together. It would essentially be your 190 or so counts being done, with a query wrapped around that gives you the sum of all of those counts in one query.
Dave
Reply With Quote
  #3 (permalink)  
Old 07-19-10, 13:55
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,195
Here is an example to produce a query from list_tables:

Code:
------------------------------ Commands Entered ------------------------------
WITH
/************************************************************
********** Start of sample data                    **********
************************************************************/
list_tables
(first_col , second_col , third_col , fourth_col) AS (
VALUES
  ('WHA1'  , 'ROOT' , 'T' , '2008-06-17-09.33.33.258557')
, ('XAA10' , 'ROOT' , 'T' , '2004-11-08-13.58.00.225807')
, ('XAA11' , 'ROOT' , 'T' , '2005-03-08-12.10.39.250286')
, ('XAA12' , 'ROOT' , 'T' , '2005-08-08-13.35.44.515158')
, ('XA12A' , 'ROOT' , 'T' , '2005-08-08-13.35.44.515158')
, ('XAA2'  , 'ROOT' , 'T' , '2000-07-14-17.58.03.662497')
, ('XAA3'  , 'ROOT' , 'T' , '2001-08-10-14.10.51.876224')
, ('XAA4'  , 'ROOT' , 'T' , '2002-07-10-12.15.12.900046')
, ('XAA5'  , 'ROOT' , 'T' , '2003-01-10-14.16.05.382125')
, ('XAA6'  , 'ROOT' , 'T' , '2003-06-10-18.42.33.149292')
, ('XAA4A' , 'ROOT' , 'T' , '2002-07-10-12.15.12.900046')
, ('XAA5B' , 'ROOT' , 'T' , '2003-01-10-14.16.05.382125')
, ('XAA6C' , 'ROOT' , 'T' , '2003-06-10-18.42.33.149292')
, ('XAA7'  , 'ROOT' , 'T' , '2003-10-07-19.23.47.670011')
, ('XAA8'  , 'ROOT' , 'T' , '2004-02-10-06.29.00.232303')
, ('XAA9'  , 'ROOT' , 'T' , '2004-06-08-15.41.04.559518')
, ('XEA1'  , 'ROOT' , 'T' , '2001-10-02-12.21.25.050381')
, ('XEAX'  , 'ROOT' , 'T' , '2001-10-02-12.21.25.050381')
, ('XGA1'  , 'ROOT' , 'T' , '2006-02-17-10.13.05.795665')
, ('XGAY'  , 'ROOT' , 'T' , '2006-02-17-10.13.05.795665')
, ('XGA2'  , 'ROOT' , 'T' , '2006-02-24-18.19.08.562070')
, ('XGAZ'  , 'ROOT' , 'T' , '2006-02-24-18.19.08.562070')
, ('YDA1'  , 'ROOT' , 'T' , '2001-06-12-14.34.35.198750')
, ('YDA2'  , 'ROOT' , 'T' , '2004-06-15-04.30.15.024587')
, ('YEA1'  , 'ROOT' , 'T' , '2001-10-04-15.00.54.048835')
)
/************************************************************
**********   End of sample data                    **********
************************************************************/
SELECT
       XMLCAST(
         XMLGROUP(
           line_text || x'0d0a' AS s
           ORDER BY n1 , n2
         )
         AS CLOB(40K)
       ) AS query_produced
  FROM (VALUES
          (1 , 0010 , 'SELECT CASE '                                   )
        , (1 , 0020 , '       WHEN GROUPING(table_name) = 1 THEN '     )
        , (1 , 0030 , '            ''*** All tables ***'' '            )
        , (1 , 0040 , '       ELSE table_name '                        )
        , (1 , 0050 , '       END  AS table_name '                     )
        , (1 , 0060 , '     , SUM(number_of_docs) AS number_of_docs '  )
        , (1 , 0070 , '  FROM ( '                                      )
        , (9 , 0010 , '       ) s (table_name , number_of_docs) '      )
        , (9 , 0020 , ' GROUP BY '                                     )
        , (9 , 0030 , '       ROLLUP(table_name) '                     )
        , (9 , 0040 , ' ORDER BY '                                     )
        , (9 , 0050 , '       NULLIF(table_name , ''*** All tables ***'') ')
        , (9 , 0070 , ';'                                              )
        UNION ALL
        SELECT 5 , k * 10 + m
             , CASE m
               WHEN 1 THEN
                    CASE k
                    WHEN 1 THEN
                         '        '
                    ELSE '        UNION ALL '
                    END
               WHEN 2 THEN
                    '        SELECT ''' || first_col || ''' '
               WHEN 3 THEN
                    '             , COUNT(*) '
               WHEN 4 THEN
                    '          FROM ' || first_col
               END
         FROM  (SELECT first_col
                     , ROW_NUMBER() OVER(ORDER BY first_col) AS k
                  FROM list_tables
                 WHERE RIGHT(first_col , 1)
                       IN ('0' , '1' , '2' , '3' , '4' , '5' , '6' , '7' , '8' , '9')
               ) q
         CROSS JOIN
               (VALUES 1 , 2 , 3 , 4) m(m)
       ) r(n1 , n2 , line_text)
 WHERE line_text <> ''
;
------------------------------------------------------------------------------

QUERY_PRODUCED                                                                
------------------------------------------------------------------------------
SELECT CASE 
       WHEN GROUPING(table_name) = 1 THEN 
            '*** All tables ***' 
       ELSE table_name 
       END  AS table_name 
     , SUM(number_of_docs) AS number_of_docs 
  FROM ( 
        SELECT 'WHA1' 
             , COUNT(*) 
          FROM WHA1
        UNION ALL 
        SELECT 'XAA10' 
             , COUNT(*) 
          FROM XAA10
...
...
        UNION ALL 
        SELECT 'YEA1' 
             , COUNT(*) 
          FROM YEA1
       ) s (table_name , number_of_docs) 
 GROUP BY 
       ROLLUP(table_name) 
 ORDER BY 
       NULLIF(table_name , '*** All tables ***') 
;


  1 record(s) selected.

Result of execution of the produced query would look like this:
Code:
TABLE_NAME         NUMBER_OF_DOCS
------------------ --------------
WHA1                        xxxxx
XAA10                       xxxxx
...

YEA1                        xxxxx
*** All tables ***        xxxxxxx


  nnn record(s) selected.

Last edited by tonkuma; 07-20-10 at 01:02. Reason: Removed table_name from ORDER BY clause in the generated query.
Reply With Quote
  #4 (permalink)  
Old 07-20-10, 03:27
nvk@vhv nvk@vhv is offline
Registered User
 
Join Date: Jan 2010
Posts: 294
Hi,

if you often update your statistics you could just add the Cards for the tables from syscat.tables. It's less accurate, but much faster.
Reply With Quote
  #5 (permalink)  
Old 07-20-10, 11:02
Bezman Bezman is offline
Registered User
 
Join Date: Jul 2010
Posts: 2
Hi all and thanks for your tips!

Shortly after I posted I found the

select tabname, card from syscat.table

query which is accurate enough for my needs.
(We run update statisctics at least once a week, I think maybe even nightly.)

Importing it to Excel made the adding easy for me too!
*is happy puppy*

Cheers!

(FYI, the # of docs was around 150 million)
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On