I'm sure this has been done before (probably 100's of times) but I'm producing a script to run on a database that will point out as many possible issues as it can ie
tables without indexes
tables with too many indexes
exceptions to naming conventions for tables and procs
procs using cursors or forceplan
tabless and procs that should have been deleted
standard fields that are defined differently from table to table (or procs)
Some of the above issues may well be ok but it should give me a quick pointer at where I'm likely to find issues to look at. Does anyone have any other ideas I could add to these checks?
Real good initiative. I do know that theres 1000s of such checks on the server level but I dont think theres too many initiatives which is intented to look at the DDL level in a development lifecycle point of view.
The problems you might face is on the proc level. Rest all is doable I think.
There should be a way to get out a showplan of the proc and start parsing through it for bad no-no stuff, like some below.
A) Procs which are doing table scan.
B) For some of the issues, you might have to turn auditing which will help
C) Procs with like %%
D) Procs which uses tempdb. The ones which could potentially fill it up. This is not that easy to figure out. You can figure out the ones with the #, but the queries which create worktables is not that easy to track.
Thanks for the ideas - they may well be good to add but I suspect some may be a little difficult to implement . I can easily do point C though. My system is currently just looking at the system tables (sysobjects, syscolumns, syscomments, sysindexes and systypes) and it's been straightforward to implement everything so far.
I'm adding some ideas on the spread of values held in index fields (ie avoiding indexes on male/female type fields) and perhaps reporting on whether a table is empty or has over 10m rows.
I have the system running now and it produces quite reasonable results. It only runs on Sybase at the moment (and I'd assume MS SQL Server) but I might try getting it working on MySQL as well. It takes about 5 mins to analysis a moderate sized database but larger databases do take longer. I've shown a cut down version of the final report but changed all the names to protect the innocent.
The table structure issues are really a matter of taste - I don't like huge monolithic tables and prefer putting tables with similar structures together into one table with an additional type field. The table naming conventions for a given database are worked out on the fly - it will see that most of the tables end with Tb and then comment on tables that don't end in Tb etc. The overall score isn't meant to mean too much but allows people to see if their databases are improving.
I may put this out as a web page so anyone could have their database checked but I'm unsure whether people would be happy entering in database connection details via a web page though. The program only needs to be able to read a few system tables and create temp tables. Would anybody use it? - remember it would only work on Sybase/SQL Server and may be MySQL.
Any additional ideas are welcome or comments on the current reporting.
Analysis of MyDb on MyServer at 29 Nov 07 09:01
Naming Standards issues
Proc jestest - naming standards - may be temp
Table BackUpEndBatchTb - may be temp
Table DatesTb_2 - naming standards
Table agus - naming standards
Table hda - naming standards
Table pbcaatcol - naming standards - used by pb_catcol
Table temp_fab - naming standards - may be temp
Proc AsxEqtyCalcStp - uses forceplan - indexes not used
Proc BR01Stp - uses cursors
Table AmortSwapTb - no indexes
Table AreaTb - no data
Table AvgCustTb - no data
Table BasketBrkDwnTb - no indexes
Table CustSavedTb - no indexes (175k rows)
Table CustomDebtTb - too many indexes
Table structure issues
Table FilTb - 83 columns
Table OrigTb - 61 columns
Table SfabTypTb - has repeating fields
Table DebtTb is the same as DebtPosTb
Table CustomDebtTb is the same as CustDebtSavedTb
Table OrigTb is almost the same as DebtPosTb
Field definition issues
Field Act is char(8) NULL in 5 tables/procs but ...
Table KeyDtaTb uses char(1) NULL
Field Ccy is CcyTyp in 11 tables/procs but ...
Table IdxCnsTb uses char(3) NULL
Table TrmsPosTb uses varchar(3) NULL
Field AreaC is numeric(4) NULL in 255 tables/procs but ...
Proc FraGrossExpoStp uses FloatTyp
Proc GrossExpoStp uses FloatTyp
Proc SensRiskStp uses IntTyp
Proc SensRptStp uses IntTyp
Proc CrrRiskStp uses char(6) NULL
Proc EqtyIntCalcStp uses decimal NULL
Proc TstStp uses decimal NULL
Proc GetDerivStp uses float NULL
Table EndBatchTb uses int NULL
Table FrntEndBatchTb uses int NULL
Proc AdjPosStp uses numeric(9) NULL
Table SecurityMapTb contains duplicate rows
Index on field CustTb.IsaF has too little variance
Overall score for MyDb is 43%
The automated database analysis program is working quite well now, it shows up a lot of issues on even the best designed databases. It looks for:
field definition issues
poor naming conventions
old unused tables
I've tried to make it as generic as possible but you'll need UNIX, Sybase (I believe it will work for MS SQL Server as well) and isql. It takes approx 10 minutes to run. I'd suggest not running it on your live database if only for peace of mind. It doesn't make any changes to the database and no account details are kept so I'm afraid you need to enter this data each time. If it proves popular I'll produce a more professional version.
It gives an overall score for the database - a high score doesn't imply you have a brilliant design but anything below 50% implies there are a lot of issues that need sorting! Older databases will usually have more issues than newer designs. Obviously it's still early days so go easy on it but I'd appreciate any feedback or suggestions. Good luck