Results 1 to 9 of 9
  1. #1
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367

    Unanswered: remaining PCTFREE for index

    Do you know if there is a way to check the remaining %free value for an index? Not what PCTFREE is set to, but what % of free space is left (average for all index pages). For example, I specified PCTFREE 25 when creating an index. Then, I started inserting data... and later on what to check what %free is remaining?

    I guess I should add that I did the initial load with index.PCTFREE=25 and then applications started adding (inserting) data.
    Last edited by db2girl; 08-11-11 at 02:26.

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Did you try REORGCHK? Formula F5?

  3. #3
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    I don't believe reorgchk.F5 gives me what I need. I created an index with pctfree=25. Here are the results after loading the table/performing runstats and then importing more rows/runstats.

    I tried db2pd -tcbstats index, inspect utility. Next is to try dumping it with db2dart, but I was hoping there is something that is easy to interpret.
    Attached Files Attached Files

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    What is the purpose of this? If you are only inserting data, the "average" leaf page utilization will always hover around (100-PCTFREE), which is to be expected. You start with an empty index page, fill it up to 75% (in your case - PCTFREE=25), then it splits, you end up with two pages, each 37.5% full, etc. With the large number of pages the average will be very close to 75%. No?

  5. #5
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    Quote Originally Posted by n_i View Post
    What is the purpose of this?
    Good question


    Quote Originally Posted by n_i View Post
    If you are only inserting data, the "average" leaf page utilization will always hover around (100-PCTFREE), which is to be expected. You start with an empty index page, fill it up to 75% (in your case - PCTFREE=25), then it splits, you end up with two pages, each 37.5% full, etc. With the large number of pages the average will be very close to 75%. No?
    But PCTFREE is only maintained during load/reorg. Inserts will fill it up to 100%

  6. #6
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by db2girl View Post
    But PCTFREE is only maintained during load/reorg. Inserts will fill it up to 100%
    I was curious if that was the case, so I did a little test.

    Code:
    >db2 create table test (f1 varchar(200))
    DB20000I  The SQL command completed successfully.
    
    >db2 create index testx1 on test(f1) pctfree 50
    DB20000I  The SQL command completed successfully.
    
    >db2 runstats on table schema.test and indexes all
    DB20000I  The RUNSTATS command completed successfully.
    
    >db2 select nleaf from syscat.indexes where indname='TESTX1'
    
    NLEAF
    --------------------
                       1
    
      1 record(s) selected.
    So now I have an empty table and an index with a key about 200 bytes long. The index sits in an 8K tablespace:
    Code:
    >db2 select pagesize from syscat.indexes i, syscat.tablespaces t where in
    dname = 'TESTX1' and t.tbspaceid = i.tbspaceid
    
    PAGESIZE
    -----------
           8192
    
      1 record(s) selected.
    I'm thinking that, given the PCTFREE of 50% and the index page overhead, inserting about 40 keys of 200 bytes each should cause a page split.
    Code:
    >db2 "insert into test (f1) with t (s, lvl) as (select repeat('a',180)||'
    1',1 from sysibm.sysdummy1 union all select repeat('a',180)||char(lvl+1), lvl+1 from sysibm.sysdummy1,t where lvl < 20)
    select s from t"
    DB20000I  The SQL command completed successfully.
    
    >db2 "insert into test (f1) with t (s, lvl) as (select repeat('b',180)||'
    1',1 from sysibm.sysdummy1 union all select repeat('c',180)||char(lvl+1), lvl+1 from sysibm.sysdummy1,t where lvl < 20)
    select s from t"
    DB20000I  The SQL command completed successfully.
    I do it in two shots to make key values somewhat different. Let's see what happened.
    Code:
    >db2 runstats on table schema.test and indexes all
    DB20000I  The RUNSTATS command completed successfully.
    
    >db2 select nleaf from syscat.indexes where indname='TESTX1'
    
    NLEAF
    --------------------
                       2
    
      1 record(s) selected.
    I now have two pages, as expected, one about 50% full, the other almost empty.

    However, what REORGCHK shows as the result of F5, remains a mystery.
    Code:
    >db2 reorgchk current statistics on table schema.test
    
    Table statistics:
    
    F1: 100 * OVERFLOW / CARD < 5
    F2: 100 * (Effective Space Utilization of Data Pages) > 70
    F3: 100 * (Required Pages / Total Pages) > 80
    
    SCHEMA.NAME                     CARD     OV     NP     FP ACTBLK    TSIZE  F1  F2  F3 REORG
    ----------------------------------------------------------------------------------------
    Table: schema.TEST
                                      40      0      2      2      -     8200   0 100 100 ---
    ----------------------------------------------------------------------------------------
    
    Index statistics:
    
    F4: CLUSTERRATIO or normalized CLUSTERFACTOR > 80
    F5: 100 * (Space used on leaf pages / Space available on non-empty leaf pages) > MIN(50, (100 - PCTFREE))
    F6: (100 - PCTFREE) * (Amount of space available in an index with one less level / Amount of space required for all keys) < 100
    F7: 100 * (Number of pseudo-deleted RIDs / Total number of RIDs) < 20
    F8: 100 * (Number of pseudo-empty leaf pages / Total number of leaf pages) < 20
    
    SCHEMA.NAME                 INDCARD  LEAF ELEAF LVLS  NDEL    KEYS LEAF_RECSIZE NLEAF_RECSIZE LEAF_PAGE_OVERHEAD NLEAF_PAGE_OVERHEAD  PCT_PAGES_SAVED  F4  F5  F6  F7  F8 REORG
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Table: schema.TEST
    Index: schema.TESTX1
                                     40     2     0    2     0      40          194           194               1568                1568                0 100 122   -   0   0 -----
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    
    CLUSTERRATIO or normalized CLUSTERFACTOR (F4) will indicate REORG is necessary
    for indexes that are not in the same sequence as the base table. When multiple
    indexes are defined on a table, one or more indexes may be flagged as needing
    REORG.  Specify the most important index for REORG sequencing.
    
    Tables defined using the ORGANIZE BY clause and the corresponding dimension
    indexes have a '*' suffix to their names. The cardinality of a dimension index
    is equal to the Active blocks statistic of the table.

  7. #7
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    Thanks, I'll try your example tomorrow.


    I generated my load file using:

    ...
    fp=fopen("foo.unl1","w");
    for (i=1;i<9000000;i++)
    fprintf(fp, "%d,string00%d\n",i,i);



    It looks like:

    1,string001
    2,string002
    3,string003
    4,string004
    ...


    The first column (int) is my non-unique index. Then, I did tail tail -n 1000000 on it and used it to import (pctfree remained unchanged for most pages and some had to be split, I suppose). What I will try tomorrow is to leave some gaps when generating my load file and then import the new ones within existing values. I'm not sure if it will make any difference but will give it a try.

  8. #8
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    db2girl, I am not sure your original question can have a useful answer. The PCTFREE value only applies to the entire Index when it is defined and/or after a Load Replace or Reorg.

    At that point every Page should have up to 75% used (leaving 25% free). Once you start to Insert rows, various pages will fill up. After that, some pages will still have 25% free (no rows were inserted with Keys on those pages, while other pages can range from 50% (page split) up to about 0%.

    How many rows will stay at 25% and how many will fill up / split depends on the distribution of the key data that is Inserted. If you have a 'hot spot' where most of the data is added, very few of the pages will change and just the pages around the hot spot will be affected (with a lot of split pages).

  9. #9
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    Thank you, Stealth_DBA.

    Actually, the person who asked me this question wanted to have a way of determining the available free space on every leaf page. I don't know why they care so much about every leaf page...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •