Unanswered: How to move huge table of binary data?
I am working with some Sybase ASA databases and need some help (I inherited them from the DBA who left the company and we are starting to have are first real issues since that DBA left).
We have some users complaining of slowing performance on their databases. When looking at the databases, they are over 100GB! Since the database is mostly binary data (scanned images and documents) I was looking at moving the image table to a separate dbspace to seperate the 90%+ of images from the remaining oltp type data. I am hoping this would help with the speed since the db engine would spend most of it's time in the smaller datafile and only access the huge image file when adding or viewing images.
Is there an easy way to do this?
1. I tried to unload the database, but the documentation stated Sybase writes out the binary data in hexadecimal (\xnnnn). That is making the data grow on disk by a factor of 5! So my 100GB of binary data should grow close to 500GB! Is there a better way to do this? I don't have a spare 1/2TB laying around.
2. I tried to recreate the table in the new dbspace and then move the data using an insert as select. This seems to work fine, but after dropping the original table I am ending up with 2 100GB+ dbfiles. I tried running a db backup, but it looks like it is going to generate 200GB+ of backup files and I don't have that kind of space. I was hoping I could use the Sybase backup utility to backup the database and then recreated it, but it does not look promising.
Are there any other options? We have several systems out there with databases in the 100GB + range and like this one, the majority of the data is binary data.
Thanks for any help.
Sybase ASA 8.0.2 (I know, out of support. I am pushing for an upgrade asap)
Option one: buy more disk space. 600 bucks will get you a terabyte. You should not not be crammed in without any space to shuffle data around. It's just not that expensive.
Option two: I haven't used Sybase much, but if you have a command-line backup utility, pipe it straight to gzip or bzip. Even low levels of compression will get around the waste of hex encoding. Even if you do get the extra disk space, running gzip with a low level of compression will save time overall because you'll spend less time waiting for the disk.
The problem is that Sybase doesn't free up any space that you've added to your table until you drop the table. You could add 1000 rows of your old document table to your new document table, then drop the 1000 rows from your old document table and in that way free up the space, repeat until all records are moved.
However Sybase keeps the space allocated as unused space to your old document table. (taking a bit of a leap of faith here because this applies to ASE, not sure if ASA has the same problem, if ASA does clear allocated space on delete, the above might be your solution).
Also you have to keep in mind that these actions allocate massive amounts of transaction log, so you're not only in need of 100GB of data, but an equal amount of log. Solution to this is clearing the log after x number of actions.
If you do have 100Gb of free space I would suggest you create a new empty database with a partition for your normal tables, and a separate partition for your documents. Then move all data from your old database to your new database (again, don't move it all at once but in batches of x records so you can clear the log from time to time). Finally drop the old database.
This has the added bonus that your new database is nice and clean.
(note, create table, move data, create referential integrity, create indexs would be a smart order to do things in)
Whether it will solve your performance problems is a totally different question. Indexs on your document table will not be influenced by massive amounts of blobs that are stored on separate pages anyways (again taking a leap of faith on how ASA works compared to ASE).
You may want to look at changing your solution so the images aren't in your database at all, storing binary data like this is easy from an application point of view, but very ineffective. Storing the documents on say, an FTP site or file server, and then putting the location of the document on this storage device in your document table, and then retrieving the documents from the file storage instead of from your database, is much more effective.
There is a risk of the file storage and your database getting out of sync if some nitwit starts to access the files on the file storage directly so you will need to protect that.
Finally, backing up data with the Sybase backup procedure can be done in such a way that it compresses (atleast ASE, just add compress::1:: infront of the file name) however when loading that database you just end up with the same situation as before.
Hope that helps you..
Last edited by BasOlij; 02-03-08 at 03:37.
Reason: forgot a remark
Just to say I agree with BasOlij regards storing the images in the file system rather than in the database. There was a discussion on this before here.
Would it be that difficult to alter the app so it either use the database for the image (as done now) or a link to an external image. When this functionality is in place you could have a program running permanently in the background moving old images which haven't been accessed for months onto the file system. When the database has shrunk sufficiently then you could rebuild the database to reclaim all that space.
If you continue with the current policy of storing images in the database then you're just delaying the time when you need to do this. Out of curiosity - how long does it take to dump a database this size?
Largest I've done so far is 200Gb within 4 hours (both dump and load, 8 hours in total to make a copy of the database on the same server). This is on a fully mirrored system (RAID 1) dumping to separate disks from where the data is stored.