I've got a large object question that I hope you guys can help with.
I'm working on a server-side application (C-language functions) that handles very large objects; a single object can reach several gigabytes. I've noticed that large objects are stored in pg_largeobject as chunks of (by default) 2 kB, so any one object of that size produces an enormous number of rows. From testing, copying a 25 MB blob to a new blob is not fast. This was done in a C-language function that reads from one blob and writes to the new one in 7 MB chunks (larger chunks failed).
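For what it's worth, the copy loop is shaped like the sketch below. I've shown it against plain stdio streams rather than the lo_* calls so it compiles standalone; the function name and the 1 MB chunk size are just illustrations, not our actual code or setting:

```c
#include <stdio.h>
#include <stdlib.h>

/* Chunk size is an illustrative assumption (1 MB), not a tuned value. */
#define COPY_CHUNK (1024 * 1024)

/* Copy src to dst in fixed-size chunks; returns bytes copied, or -1 on
 * failure. The same loop shape applies to lo_read()/lo_write() against
 * large objects, or to plain files under a filesystem-based scheme. */
static long chunked_copy(FILE *src, FILE *dst)
{
    char *buf = malloc(COPY_CHUNK);
    size_t n;
    long total = 0;

    if (buf == NULL)
        return -1;
    while ((n = fread(buf, 1, COPY_CHUNK, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n) {
            free(buf);
            return -1;
        }
        total += (long) n;
    }
    free(buf);
    return total;
}
```

The question is really whether any chunk size makes this acceptably fast once the source and destination are both going through pg_largeobject.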
One idea we had was to use the server's filesystem to store the actual data and keep only information about it in the tables. To do this, two tables were defined:
CREATE TABLE BlobSpace (
blobspace_id SERIAL PRIMARY KEY,
blobspace_name VARCHAR(128) NOT NULL,
blobspace_root VARCHAR(256) NOT NULL,
max_size BIGINT DEFAULT 0,
size_used BIGINT DEFAULT 0
);
CREATE TABLE Blobs (
blob_id SERIAL PRIMARY KEY,
blobspace_id INTEGER NOT NULL REFERENCES BlobSpace (blobspace_id),
blob_filename VARCHAR(256) NOT NULL,
blob_size BIGINT NOT NULL
);
BlobSpace defines a space on the filesystem (a directory, drive, etc.) available for storing blobs, along with how much of that space is available and used. Blobs holds the filename of each blob and the blobspace it resides in. When a blob is inserted, its data is written to a file in the appropriate blobspace and its metadata is recorded in these tables; the blob_id is then stored in whatever table needs the blob.
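To make the flow concrete, the SQL side of an insert might look like the following (the names, sizes, and paths are all made-up illustrations, and it assumes the Blobs table carries a filename column as described above, with the first blobspace getting id 1):

```sql
-- Register a blobspace (the 100 GB cap is a hypothetical figure):
INSERT INTO BlobSpace (blobspace_name, blobspace_root, max_size)
VALUES ('bulk1', '/var/blobspaces/bulk1', 107374182400);

-- After writing the object's data to a file under that root,
-- record it:
INSERT INTO Blobs (blobspace_id, blob_filename, blob_size)
VALUES (1, 'blob_0001.dat', 26214400);

-- Keep the accounting current:
UPDATE BlobSpace
   SET size_used = size_used + 26214400
 WHERE blobspace_id = 1;
```

The file write and these statements would of course need to be kept in sync by the C function doing the insert, since the filesystem isn't covered by the transaction.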
The environment is PostgreSQL 7.3.2 on Red Hat 7.3.
Does this scheme make sense and is it really feasible? Or am I way off base? Better yet, has this been implemented before?
Any suggestions or other possible schemes?