Results 1 to 8 of 8
  1. #1
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527

    Unanswered: Ideas for an automated database checker

    I'm sure this has been done before (probably 100's of times) but I'm producing a script to run on a database that will point out as many possible issues as it can ie
    • tables without indexes
    • tables with too many indexes
    • exceptions to naming conventions for tables and procs
    • procs using cursors or forceplan
    • tabless and procs that should have been deleted
    • standard fields that are defined differently from table to table (or procs)
    • etc etc
    Some of the above issues may well be ok but it should give me a quick pointer at where I'm likely to find issues to look at. Does anyone have any other ideas I could add to these checks?

    Mike

  2. #2
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443
    Quote Originally Posted by mike_bike_kite
    I'm sure this has been done before (probably 100's of times) but I'm producing a script to run on a database that will point out as many possible issues as it can ie
    • tables without indexes
    • tables with too many indexes
    • exceptions to naming conventions for tables and procs
    • procs using cursors or forceplan
    • tabless and procs that should have been deleted
    • standard fields that are defined differently from table to table (or procs)
    • etc etc
    Some of the above issues may well be ok but it should give me a quick pointer at where I'm likely to find issues to look at. Does anyone have any other ideas I could add to these checks?

    Mike
    Hi Mike,

    Real good initiative. I do know that theres 1000s of such checks on the server level but I dont think theres too many initiatives which is intented to look at the DDL level in a development lifecycle point of view.

    The problems you might face is on the proc level. Rest all is doable I think.

    There should be a way to get out a showplan of the proc and start parsing through it for bad no-no stuff, like some below.

    A) Procs which are doing table scan.
    B) For some of the issues, you might have to turn auditing which will help
    C) Procs with like %%
    D) Procs which uses tempdb. The ones which could potentially fill it up. This is not that easy to figure out. You can figure out the ones with the #, but the queries which create worktables is not that easy to track.

  3. #3
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Thanks for the ideas - they may well be good to add but I suspect some may be a little difficult to implement . I can easily do point C though. My system is currently just looking at the system tables (sysobjects, syscolumns, syscomments, sysindexes and systypes) and it's been straightforward to implement everything so far.

    I'm adding some ideas on the spread of values held in index fields (ie avoiding indexes on male/female type fields) and perhaps reporting on whether a table is empty or has over 10m rows.

    Mike

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I have the system running now and it produces quite reasonable results. It only runs on Sybase at the moment (and I'd assume MS SQL Server) but I might try getting it working on MySQL as well. It takes about 5 mins to analysis a moderate sized database but larger databases do take longer. I've shown a cut down version of the final report but changed all the names to protect the innocent.

    The table structure issues are really a matter of taste - I don't like huge monolithic tables and prefer putting tables with similar structures together into one table with an additional type field. The table naming conventions for a given database are worked out on the fly - it will see that most of the tables end with Tb and then comment on tables that don't end in Tb etc. The overall score isn't meant to mean too much but allows people to see if their databases are improving.

    I may put this out as a web page so anyone could have their database checked but I'm unsure whether people would be happy entering in database connection details via a web page though. The program only needs to be able to read a few system tables and create temp tables. Would anybody use it? - remember it would only work on Sybase/SQL Server and may be MySQL.

    Any additional ideas are welcome or comments on the current reporting.

    Mike

    Code:
        Analysis of MyDb on MyServer at 29 Nov 07 09:01
    
    Naming Standards issues
       Proc jestest  - naming standards - may be temp
       Table BackUpEndBatchTb  - may be temp
       Table DatesTb_2  - naming standards
       Table agus  - naming standards
       Table hda  - naming standards
       Table pbcaatcol  - naming standards - used by pb_catcol
       Table temp_fab  - naming standards - may be temp
    
    Performance issues
       Proc AsxEqtyCalcStp  - uses forceplan - indexes not used
       Proc BR01Stp  - uses cursors
       Table AmortSwapTb  - no indexes
       Table AreaTb  - no data
       Table AvgCustTb  - no data
       Table BasketBrkDwnTb  - no indexes
       Table CustSavedTb  - no indexes (175k rows)
       Table CustomDebtTb  - too many indexes
    
    Table structure issues
       Table FilTb  - 83 columns
       Table OrigTb  - 61 columns
       Table SfabTypTb  - has repeating fields
       Table DebtTb is the same as DebtPosTb
       Table CustomDebtTb is the same as CustDebtSavedTb
       Table OrigTb is almost the same as DebtPosTb
    
    Field definition issues
       Field Act is char(8) NULL in 5 tables/procs but ...
          Table KeyDtaTb uses char(1) NULL
    
       Field Ccy is CcyTyp in 11 tables/procs but ...
          Table IdxCnsTb uses char(3) NULL
          Table TrmsPosTb uses varchar(3) NULL
    
       Field AreaC is numeric(4) NULL in 255 tables/procs but ...
          Proc FraGrossExpoStp uses FloatTyp
          Proc GrossExpoStp uses FloatTyp
          Proc SensRiskStp uses IntTyp
          Proc SensRptStp uses IntTyp
          Proc CrrRiskStp uses char(6) NULL
          Proc EqtyIntCalcStp uses decimal  NULL
          Proc TstStp uses decimal  NULL
          Proc GetDerivStp uses float  NULL
          Table EndBatchTb uses int  NULL
          Table FrntEndBatchTb uses int  NULL
          Proc AdjPosStp uses numeric(9) NULL
    
    Note
       Table SecurityMapTb contains duplicate rows
       Index on field CustTb.IsaF has too little variance
    
                Overall score for MyDb is 43%

  5. #5
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443
    Quote Originally Posted by mike_bike_kite
    I may put this out as a web page so anyone could have their database checked but I'm unsure whether people would be happy entering in database connection details via a web page though.
    Great job mate. I doubt people would use it thro web. But you can maybe create it as a standalone executable and charge OR have it free. I would start with free until it gets stable enough.

  6. #6
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Do you fancy being a test pilot then?
    If so, do use just sybase or sql server as well?
    Mike

  7. #7
    Join Date
    Sep 2003
    Location
    Switzerland
    Posts
    443
    See private message

  8. #8
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    The automated database analysis program is working quite well now, it shows up a lot of issues on even the best designed databases. It looks for:
    • performance issues
    • field definition issues
    • poor naming conventions
    • old unused tables
    • duplicate data
    • etc etc

    I've tried to make it as generic as possible but you'll need UNIX, Sybase (I believe it will work for MS SQL Server as well) and isql. It takes approx 10 minutes to run. I'd suggest not running it on your live database if only for peace of mind. It doesn't make any changes to the database and no account details are kept so I'm afraid you need to enter this data each time. If it proves popular I'll produce a more professional version.

    http://www.bikesandkites.com/DbHealthCheck.php

    It gives an overall score for the database - a high score doesn't imply you have a brilliant design but anything below 50% implies there are a lot of issues that need sorting! Older databases will usually have more issues than newer designs. Obviously it's still early days so go easy on it but I'd appreciate any feedback or suggestions. Good luck

    Mike
    Last edited by mike_bike_kite; 02-18-08 at 05:57.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •