  1. #1
    Join Date
    May 2013
    Posts
    5

    Unanswered: How to organize dbspaces

    Hi guys,

    Which is the better approach to organizing dbspaces?
    One dbspace with several chunks, or several dbspaces with one chunk each?

    Regards

  2. #2
    Join Date
    Sep 2011
    Location
    Pont l'Abbé, Brittany, France
    Posts
    381
    Provided Answers: 1
    Hi bg,

    Several dbspaces by themselves are of no interest: there is no point in logically separating data if it all sits on the same spindle.

    Depending on the amount of data you want to handle in the dbspace, a rule of thumb is to keep chunks no larger than 20 GB, for ease of handling and maintenance.

    Some will say 10 GB and others 30 GB. Up to you...

    The real benefit comes from putting the data in different dbspaces that sit on different physical disks, in order to take advantage of the parallel I/O performed by the engine, or, put differently, to avoid I/O contention on a single spindle.
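
    As a sketch only: assuming the SQL Administration API that comes with the sysadmin database (IDS 11.10 and later), and with made-up raw-device paths, two single-chunk dbspaces on two physical disks could be created like this:

        -- Run while connected to the sysadmin database as user informix.
        -- Two dbspaces, each with one chunk on its own physical disk,
        -- so the engine can drive both spindles in parallel.
        -- Sizes here are in KB (20971520 KB = 20 GB); paths are invented.
        EXECUTE FUNCTION task('create dbspace', 'dbs_disk1', '/dev/rdisk1_chunk1', '20971520', '0');
        EXECUTE FUNCTION task('create dbspace', 'dbs_disk2', '/dev/rdisk2_chunk1', '20971520', '0');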

  3. #3
    Join Date
    May 2013
    Posts
    5
    Thanks begooden-it,

    Understood, but how does the Informix engine handle one dbspace with two chunks, each on a different disk? Do we get parallel I/O in that case or not?

    I always see people putting only one chunk in a dbspace; that's why putting more than one chunk within a dbspace looks wrong to me.

    I have never seen people doing it like this:
    data_dbs ----> dbspace
    data_dbs1 ---> chunk for data_dbs on disk 1
    data_dbs2 ---> another chunk for data_dbs on disk 2
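
    Expressed with the same sysadmin task() API (again just a sketch, device paths invented), that layout would be:

        -- One dbspace whose two chunks sit on different physical disks,
        -- i.e. the layout sketched above (sizes in KB, 20971520 KB = 20 GB).
        EXECUTE FUNCTION task('create dbspace', 'data_dbs', '/dev/rdisk1_chunk1', '20971520', '0');
        EXECUTE FUNCTION task('add chunk', 'data_dbs', '/dev/rdisk2_chunk1', '20971520', '0');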

  4. #4
    Join Date
    Sep 2011
    Location
    Pont l'Abbé, Brittany, France
    Posts
    381
    Provided Answers: 1
    Informix runs query parallelism based on dbspaces, not on chunks.

    The basic fact is that you (a user session) can control which dbspace you put your data in, but not which chunk your data goes to.

    The checkpoint, however, sorts the data to be written by chunk and, depending on the number of page cleaners, the chunk writes will effectively be executed in parallel. But once again, the database engine controls this, not the user session.

    It is not bad practice per se to have a dbspace with several chunks in it, but there is NO real benefit in creating the chunks of one dbspace on different physical disks, except when the size you need does not fit on a single physical disk. The main reason is, again, that the user session cannot control which chunk the data goes to; the engine does.
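
    To illustrate, a minimal sketch (names made up, assuming a dbspace dbs_disk1 exists): the only placement a session can express in its DDL is at the dbspace level.

        -- Placement is expressed per dbspace in the DDL...
        CREATE TABLE customer (
            id   SERIAL,
            name VARCHAR(64)
        ) IN dbs_disk1;
        -- ...but there is no "IN chunk" clause: which chunk of dbs_disk1
        -- each page lands in is decided by the engine alone.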

    Things can get more complex with RAID-based storage, because it is even more difficult to control where the data really is (avoid RAID 5 if you need I/O performance with Informix).

    Once you combine the first two points above with how the applications access the data, you have the main guidelines for setting up your disk layout.
    Last edited by begooden-it; 05-26-13 at 17:04.

  5. #5
    Join Date
    May 2013
    Posts
    5
    Thanks a lot, begooden-it, for your clarification.
    You are the man

  6. #6
    Join Date
    Sep 2011
    Location
    Pont l'Abbé, Brittany, France
    Posts
    381
    Provided Answers: 1
    My Pleasure :-)

  7. #7
    Join Date
    Feb 2009
    Location
    Brooklyn, NY
    Posts
    9
    bgoliveira,
    Just to chime in and elaborate atop begooden's comments:

    Putting your database on one dbspace spread across several chunks on multiple spindles may help to speed up page-cleaning by running the page cleaners in parallel, one to a chunk.

    However...
    Unless some hardware-aware intelligence has been added in this regard, your PDQ queries will happily stick to one thread, because PDQ threads are distributed by dbspace (and tblspace, of course).

    So...
    If you will need PDQ, your best bet is to split your target tables across multiple dbspaces, making as sure as possible that those dbspaces are on separate spindles. (Not easy with these big SAN disk farms.) By the way, another suggestion I have implemented in the past:


    • For the tables: fragment by expression (but do try to keep the expression simple).
    • Same fragmentation scheme for the index, but place each index partition on a different spindle from the data it targets.
    • For tables related by referential constraint, try to keep the related master and detail partitions on different spindles.
    • Don't create the PK/FK constraints when you create the table. After you create the table, create a unique index for PK column[s] on the master table, fragmented as I suggested above. Then create an index on the FK column[s] of the detail table. Only *then* do the ALTER TABLE to add PK and FK constraints.

    In my most recent implementation of this, I had only 4 spindles, so I fragmented the monster table on spindle[1], [2], [3], [4], but their respective indexes were fragmented on spindle[2], [3], [4], [1]. And the detail table, with the same fragmentation expression, was split over spindle[3], [4], [1], [2]. (And yes, I played the same game with the FK index.)
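
    In DDL terms it comes out roughly like the sketch below. All names are hypothetical: dbs1 through dbs4 stand for dbspaces assumed to sit on spindles 1 through 4, and MOD() is just one deliberately simple fragmentation expression; yours may differ.

        -- Master table on spindles 1,2,3,4.
        CREATE TABLE orders (
            order_id   INTEGER NOT NULL,
            cust_id    INTEGER NOT NULL,
            order_date DATE
        )
        FRAGMENT BY EXPRESSION
            MOD(order_id, 4) = 0 IN dbs1,
            MOD(order_id, 4) = 1 IN dbs2,
            MOD(order_id, 4) = 2 IN dbs3,
            MOD(order_id, 4) = 3 IN dbs4;

        -- Unique index for the future PK: same expression, offset one spindle.
        CREATE UNIQUE INDEX orders_pk_ix ON orders (order_id)
        FRAGMENT BY EXPRESSION
            MOD(order_id, 4) = 0 IN dbs2,
            MOD(order_id, 4) = 1 IN dbs3,
            MOD(order_id, 4) = 2 IN dbs4,
            MOD(order_id, 4) = 3 IN dbs1;

        -- Detail table offset two spindles from the master (3,4,1,2).
        CREATE TABLE order_items (
            order_id INTEGER NOT NULL,
            line_no  SMALLINT NOT NULL,
            qty      INTEGER
        )
        FRAGMENT BY EXPRESSION
            MOD(order_id, 4) = 0 IN dbs3,
            MOD(order_id, 4) = 1 IN dbs4,
            MOD(order_id, 4) = 2 IN dbs1,
            MOD(order_id, 4) = 3 IN dbs2;

        -- Index on the future FK column, offset from its own data too.
        CREATE INDEX order_items_fk_ix ON order_items (order_id)
        FRAGMENT BY EXPRESSION
            MOD(order_id, 4) = 0 IN dbs4,
            MOD(order_id, 4) = 1 IN dbs1,
            MOD(order_id, 4) = 2 IN dbs2,
            MOD(order_id, 4) = 3 IN dbs3;

        -- Only *then* add the constraints, so they ride on the indexes we built.
        ALTER TABLE orders ADD CONSTRAINT
            PRIMARY KEY (order_id) CONSTRAINT orders_pk;
        ALTER TABLE order_items ADD CONSTRAINT
            FOREIGN KEY (order_id) REFERENCES orders (order_id)
            CONSTRAINT order_items_fk;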

    If you are less constrained by disk space you have even more options.

    I used an Excel spreadsheet of the table locations, with formulas generating the ALTER FRAGMENT statements.
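
    One generated statement might look like the following (sketch only, same hypothetical names as above); ALTER FRAGMENT ... INIT rewrites the fragmentation scheme of an existing table in place:

        -- Re-fragment an existing table onto the four dbspaces.
        ALTER FRAGMENT ON TABLE orders INIT
        FRAGMENT BY EXPRESSION
            MOD(order_id, 4) = 0 IN dbs1,
            MOD(order_id, 4) = 1 IN dbs2,
            MOD(order_id, 4) = 2 IN dbs3,
            MOD(order_id, 4) = 3 IN dbs4;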

    Good luck getting your Unix SAs to create your raw devices on the separate spindles!

    -- Rasputin Paskudniak (In pursuit of undomesticated, semi-aquatic avians)
