At one of my clients I am in charge of the "DB2 content manager" and "DB2 content manager on demand" systems, a.k.a. ECM systems. Both systems use the DB2 database to keep track of all the documents (the index) and of the location where the actual documents are stored. The documents themselves are stored outside of the database on the filesystem (and later on in a black box called TSM).
Nice concept, and performance is all right, but backup/restore/recovery/consistency is a nightmare! A DB2 backup is no problem, but a backup of a filesystem containing millions of documents takes forever. No one has ever been able to explain to me why ECM systems do not use the database to store the documents as BLOBs. Yes, BLOBs are 'slower' because DB2 has to do an extra I/O outside of the bufferpools to obtain the data, but the ECM system itself has to do exactly the same: locate the document on disk and retrieve it. Is that really that much faster? I doubt it.
And what about storage? I'd rather manage disks with a "few" very large tablespace containers than disks with millions of small files.
But still: most of the ECM vendors choose to store the documents outside of the database (even IBM, and they have DB2 at their disposal). Is there anyone out there who understands the concept and is able to explain it to me?