Results 1 to 11 of 11
  1. #1
    Join Date
    Aug 2004
    Posts
    54

    Unanswered: Multiple Data Files

    We have a large Database (91 GB) that is currently in one large data file. Now that we have muliple disk arrays I can split that up on I would like to have a couple data files. My question is, what is the best way to split this up? Should I keep one primary file group and just create another file, or should I create a file group for indexes and put those on that? This database is used for reporting only so it doesn't really have any writes being done on it.

    Thanks much.

  2. #2
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    do you have a lot aof medium size tables, or several very large ones?

    The answer is always..."it depends"
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  3. #3
    Join Date
    Aug 2004
    Posts
    54
    Unfortunately it is pretty much one big table. It is horribly designed.

  4. #4
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    What's the DDL of the table?

    How many physical arrarys do you have?

    Let's say you have 3

    Create 9 partitions for a partitioned view

    put file 1,4,7 on array 1
    put 2,5,8 on array 2
    put 3,6,9 on array 3

    Arrange the partitions so they are approximatley 1 GB each

    The more files on to independant drives the better..and I think I would keep the indexes with the tables...

    Anyone?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  5. #5
    Join Date
    Aug 2004
    Posts
    54
    Well, here is the DDL, it is a beast though.

    Code:
    CREATE TABLE [dbo].[CLARITY_TDL] (
    	[TDL_ID] [numeric](19, 0) NOT NULL ,
    	[DETAIL_TYPE] [int] NOT NULL ,
    	[TYPE] [int] NOT NULL ,
    	[POST_DATE] [datetime] NULL ,
    	[ORIG_POST_DATE] [datetime] NULL ,
    	[ORIG_SERVICE_DATE] [datetime] NULL ,
    	[PERIOD] [varchar] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[TX_ID] [int] NULL ,
    	[TRAN_TYPE] [int] NULL ,
    	[ALLOWED_AMOUNT] [numeric](12, 2) NULL ,
    	[CHARGE_SLIP_NUMBER] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[TYPE_OF_SERVICE] [int] NULL ,
    	[MATCH_TRX_ID] [int] NULL ,
    	[MATCH_TX_TYPE] [int] NULL ,
    	[MATCH_PROC_ID] [int] NULL ,
    	[MATCH_PROV_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[MATCH_LOC_ID] [int] NULL ,
    	[ACCOUNT_ID] [int] NULL ,
    	[PAT_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[AMOUNT] [numeric](12, 2) NULL ,
    	[PATIENT_AMOUNT] [numeric](12, 2) NULL ,
    	[INSURANCE_AMOUNT] [numeric](12, 2) NULL ,
    	[RELATIVE_VALUE_UNIT] [numeric](12, 2) NULL ,
    	[CUR_CVG_ID] [int] NULL ,
    	[CUR_PLAN_ID] [int] NULL ,
    	[CUR_PAYOR_ID] [int] NULL ,
    	[CUR_FIN_CLASS] [int] NULL ,
    	[PERFORMING_PROV_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[BILLING_PROVIDER_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[ORIGINAL_CVG_ID] [int] NULL ,
    	[ORIGINAL_PLAN_ID] [int] NULL ,
    	[ORIGINAL_PAYOR_ID] [int] NULL ,
    	[ORIGINAL_FIN_CLASS] [int] NULL ,
    	[PROC_ID] [int] NULL ,
    	[PROCEDURE_QUANTITY] [int] NULL ,
    	[CPT_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[MODIFIER_ONE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[MODIFIER_TWO] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[MODIFIER_THREE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[MODIFIER_FOUR] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[DX_ONE_ID] [int] NULL ,
    	[DX_TWO_ID] [int] NULL ,
    	[DX_THREE_ID] [int] NULL ,
    	[DX_FOUR_ID] [int] NULL ,
    	[DX_FIVE_ID] [int] NULL ,
    	[DX_SIX_ID] [int] NULL ,
    	[SERV_AREA_ID] [int] NULL ,
    	[LOC_ID] [int] NULL ,
    	[DEPT_ID] [int] NULL ,
    	[POS_ID] [int] NULL ,
    	[CUSTOMER_ITEM_ONE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CUSTOMER_ITEM_TWO] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CUSTOMER_ITEM_THREE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CUSTOMER_ITEM_FOUR] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CUSTOMER_ITEM_FIVE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[BILL_CLAIM_AMOUNT] [numeric](12, 2) NULL ,
    	[INVOICE_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[ABF_FORM_ID] [int] NULL ,
    	[CLM_CLAIM_ID] [int] NULL ,
    	[BILL_HOLD_AMOUNT] [numeric](12, 2) NULL ,
    	[PAT_AGING_DAYS] [int] NULL ,
    	[INS_AGING_DAYS] [int] NULL ,
    	[ACTION_CVG_ID] [int] NULL ,
    	[ACTION_PLAN_ID] [int] NULL ,
    	[ACTION_PAYOR_ID] [int] NULL ,
    	[ACTION_FIN_CLASS] [int] NULL ,
    	[DEBIT_GL_NUM] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CREDIT_GL_NUM] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[REASON_CODE_ID] [int] NULL ,
    	[USER_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[COPAY_INDICATOR] [int] NULL ,
    	[TX_NUM] [int] NULL ,
    	[ORIG_PRICE] [numeric](12, 2) NULL ,
    	[PRICE_CONTRACT_ID] [int] NULL ,
    	[CONTRACT_DISCOUNT] [numeric](12, 2) NULL ,
    	[INT_PAT_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[ORIG_AMT] [numeric](12, 2) NULL ,
    	[PAT_TYPE_C] [int] NULL ,
    	[PRIM_CARE_PROV] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[REFERRAL_ID] [int] NULL ,
    	[REFERRAL_SOURCE_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[PRIM_LOCATION] [int] NULL ,
    	[RVU_WORK] [numeric](12, 2) NULL ,
    	[RVU_OVERHEAD] [numeric](12, 2) NULL ,
    	[RVU_MALPRACTICE] [numeric](12, 2) NULL ,
    	[POSTING_BATCH_NUM] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[VOID_USER_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[R_NEW_CHG_TX_ID] [int] NULL ,
    	[R_ORIG_CHG_TX_ID] [int] NULL ,
    	[R_ORIG_POST_DATE] [datetime] NULL ,
    	[R_ORIG_AMOUNT] [numeric](12, 2) NULL ,
    	[R_ORIG_PAT_AMOUNT] [numeric](12, 2) NULL ,
    	[R_ORIG_FIN_CLASS] [int] NULL ,
    	[R_ORIG_PAYOR_ID] [int] NULL ,
    	[R_ORIG_PLAN_ID] [int] NULL ,
    	[R_ORIG_CVG_ID] [int] NULL ,
    	[REFERENCE_NUMBER] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[PAYMENT_SOURCE_C] [int] NULL ,
    	[TX_COMMENT] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[ANESTHESIA_TYPE_C] [int] NULL ,
    	[EMERGENCY_STATUS_C] [int] NULL ,
    	[PHYSICAL_STATUS_C] [int] NULL ,
    	[CONCURRENCY_CALC] [numeric](6, 2) NULL ,
    	[CONCURRENCY_OVER] [numeric](6, 2) NULL ,
    	[BASE_UNITS] [numeric](6, 2) NULL ,
    	[TIMED_UNITS] [numeric](6, 2) NULL ,
    	[PHY_STATUS_UNITS] [numeric](6, 2) NULL ,
    	[EMERGENCY_UNITS] [numeric](6, 2) NULL ,
    	[AGE_UNITS] [numeric](6, 2) NULL ,
    	[MED_SUP_MOD_OVER_C] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[ANESTHESIA_CERT] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[TECH_CHARGE_FLAG] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[NIA_OUTCOME_CODE_C] [int] NULL ,
    	[PROC_MED_NEC_FLAG] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CRD_CHARGE_SLIP_NO] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[VISIT_NUMBER] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[CRNA_CHARGE_ID] [int] NULL ,
    	[ORIG_ANES_CHG_ID] [int] NULL ,
    	[CONTESTED_FLAG_YN] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[PAT_ENC_CSN_ID] [int] NULL ,
    	[ACTION_USER_ID] [varchar] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[TDL_EXTRACT_DATE] [datetime] NULL ,
    	[INSURANCE_ID] [varchar] (17) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[SPECIMEN_ID] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[TEST_ID] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[LAB_ID] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[PANEL_ID] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    	[PROV_SPECIALTY_C] [int] NULL 
    ) ON [PRIMARY]
    GO
    
    ALTER TABLE [dbo].[CLARITY_TDL] WITH NOCHECK ADD 
    	CONSTRAINT [PK__CLARITY_TDL__746F28F1] PRIMARY KEY  CLUSTERED 
    	(
    		[TDL_ID]
    	) WITH  FILLFACTOR = 80  ON [PRIMARY] 
    GO
    
     CREATE  INDEX [CLARITY_TDL__DETAIL_TYPE] ON [dbo].[CLARITY_TDL]([DETAIL_TYPE]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    
     CREATE  INDEX [CLARITY_TDL__POST_DATE] ON [dbo].[CLARITY_TDL]([POST_DATE]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    
     CREATE  INDEX [CLARITY_TDL__PERIOD] ON [dbo].[CLARITY_TDL]([PERIOD]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    
     CREATE  INDEX [CLARITY_TDL__SERV_AREA_ID] ON [dbo].[CLARITY_TDL]([SERV_AREA_ID]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    
     CREATE  INDEX [CLARITY_TDL__ORIG_POST_DATE] ON [dbo].[CLARITY_TDL]([ORIG_POST_DATE]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    
     CREATE  INDEX [CLARITY_TDL__TX_ID] ON [dbo].[CLARITY_TDL]([TX_ID]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    
     CREATE  INDEX [CLARITY_TDL__ORIG_SERVICE_DATE] ON [dbo].[CLARITY_TDL]([ORIG_SERVICE_DATE]) WITH  FILLFACTOR = 80,  PAD_INDEX  ON [PRIMARY]
    GO
    We do have 3 physical disk arrays for this.

    What is the advantage of putting multiple files on one array? That wouldn't decrease the load at all.

  6. #6
    Join Date
    Jan 2005
    Location
    TempDb
    Posts
    228
    I think Brett's idea of splitting out the filegroups is to increase the odds that subsequent reads will come from a different array. It doesn't decrease your load, but it should improve the distribution of your load across your arrays. I don't know what the optimum number is...

    The only thing that jumps out at me, other than the fact most of your columns are NULL, is that you are leaving 20% of your pages empty because you don't really have any writes. My guess is your writes are done based on your PK, TDL_ID, which I'm also guessing is a monotonically increasing value that will result in all your inserts being done at the end of your page chain. Thus, your pages stay 20% empty.

    If these assumptions are correct, changing your FILLFACTOR to 100% will decrease your i/o's by ~ 20%.

  7. #7
    Join Date
    Aug 2004
    Posts
    54
    I left the pages at 20% empty because we get a fairly large load on the database every night (about 50 MB). TDL_ID is however monotonically increasing, so you are right, the inserts would be at the end of the chain, I never thought of that. Thanks for the help.

  8. #8
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    It'll increase the number of threads due to parallelism AND will reduce contention because the file will be on different arrays. And by placing the "next of kin" data on separate arrays, I'm assuming that a particular request for data that is "close" will also thread out.

    I was just reading about Federated Servers, but I've never used them, and I suspect that not many people do.

    Since they're also using partitioned views as well, I don't see how dragging the network in to the equation is a plus.

    I would just try the partitioning on the single box first.

    Actually, I would consider normalizing the table first.
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  9. #9
    Join Date
    Aug 2004
    Posts
    54
    Great, thanks for all of your help on this.

    I would love to normalize that table, but unfortunately it is out of my hands. That table is created and used by a third party product. I know it is ugly, but there isn’t much I can do about it.

  10. #10
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    I wonder how many level you can do partitioning...I'll have to play with that..

    BUT...if you could, you could partition out the data, AND create another view that is a join to normalized tables and a partition view.

    You would just have the final view as the exact same name as your messy table.

    AND, can someone PLEASE tell me why all 3rd party stuff looks like crap all the time?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  11. #11
    Join Date
    Aug 2004
    Posts
    54
    That is a good idea. I have been wanting to normalize that thing since the first time I saw it.

    Wouldn't that have a problem with inserts though? The data is also loaded through a third party product.

    If I partition the table and use a view to UNION them all together, the load would have to be changed, correct? That is why I originally just thought I would put all the indexes on a seperate data file.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •