Quote:
Originally Posted by pootle flump
500,000 of 10 KB is orders of magnitude different to 1,000,000 of 100 MB. To plan capacity you will need to zero in on more accurate numbers.
|
You're absolutely right. Unfortunately, it really does vary that much. Right now we're consuming 10 GB, but I'd expect that to rise in the future. It is very hard to pin down because of the kind of data that comes in.
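To put numbers on that spread, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 KB = 1,000 bytes) and counts raw payload only, so row, index, and storage overhead are ignored; the function name is just for illustration.

```python
# Assumption: decimal units (1 KB = 1,000 bytes). Binary units would
# shift the totals by a few percent but not change the conclusion.
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB
TB = 1_000 * GB


def total_bytes(blob_count: int, blob_size_bytes: int) -> int:
    """Raw payload size, ignoring row, index, and storage overhead."""
    return blob_count * blob_size_bytes


# The two extremes quoted above.
low = total_bytes(500_000, 10 * KB)
high = total_bytes(1_000_000, 100 * MB)

print(f"low estimate:  {low / GB:.0f} GB")
print(f"high estimate: {high / TB:.0f} TB")
print(f"ratio: {high // low:,}x")
```

So the two ends of the range are about 5 GB and 100 TB, a factor of 20,000 apart, which is why narrowing the estimate matters for planning.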
Quote:
Originally Posted by pootle flump
I'm afraid I don't know what a collapsing table is.
|
I guess I worded that badly. What I want to avoid is having the database throw an exception because the table has exceeded its capacity.
Quote:
Originally Posted by pootle flump
What is your RDBMS?
|
Likely Postgres.
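As a rough sanity check on the capacity worry, the sketch below compares an estimate against the limits listed in the PostgreSQL documentation as I understand them: about 32 TB per table with the default 8 KB block size, and 1 GB per field. The constants and the function name are my own, and the check ignores row and storage overhead, so treat it as an approximation only.

```python
# Limits as listed in the PostgreSQL documentation (default 8 KB block size).
# Assumption: expressed here in binary units.
PG_MAX_TABLE_BYTES = 32 * 1024**4  # about 32 TB per table
PG_MAX_FIELD_BYTES = 1024**3       # 1 GB per field value


def fits_in_one_table(blob_count: int, blob_size_bytes: int) -> bool:
    """Rough check of raw payload against the per-field and per-table limits."""
    if blob_size_bytes > PG_MAX_FIELD_BYTES:
        return False
    return blob_count * blob_size_bytes <= PG_MAX_TABLE_BYTES


# Low end of the range: 500,000 blobs of 10 KB each.
print(fits_in_one_table(500_000, 10_000))
# High end of the range: 1,000,000 blobs of 100 MB each.
print(fits_in_one_table(1_000_000, 100_000_000))
```

By this estimate the low end fits comfortably, while the high end (about 100 TB of payload) would not fit in a single table.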
Quote:
Originally Posted by pootle flump
Do you have hardware?
|
A mid-range dedicated 2008 server.
Quote:
Originally Posted by pootle flump
What is the purpose of the database?
|
To expose the data (and metadata) to a web service, which in turn exposes it internally. The web service would relay the raw data and metadata depending on the request.
Quote:
Originally Posted by pootle flump
Is it pretty static and will only a blob or two at a time be retrieved, or is it a highly transactional system that will have many thousands of blobs affected by each transaction?
|
Fairly static. I'd say 90% of them won't change in any one day. Most requests will be for only one blob of data, but for multiple pieces of metadata.
So, for example, you might have a request that downloads one hundred metadata rows but then downloads only one blob of data as an end result.
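The access pattern described above could look something like the following sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not Postgres, and the table and column names (blob_meta, blob_data, payload) are hypothetical. The idea is to keep metadata and blobs in separate tables so that listing a hundred metadata rows never touches the blob payloads.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blob_meta (
        id         INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        size_bytes INTEGER NOT NULL
    );
    CREATE TABLE blob_data (
        meta_id INTEGER PRIMARY KEY REFERENCES blob_meta(id),
        payload BLOB NOT NULL
    );
""")

# Load 100 small made-up files.
for i in range(1, 101):
    payload = f"contents of file {i}".encode()
    conn.execute(
        "INSERT INTO blob_meta (id, name, size_bytes) VALUES (?, ?, ?)",
        (i, f"file_{i}.dat", len(payload)),
    )
    conn.execute(
        "INSERT INTO blob_data (meta_id, payload) VALUES (?, ?)",
        (i, payload),
    )
conn.commit()

# Step 1: the request lists one hundred metadata rows (no blob data read).
meta_rows = conn.execute(
    "SELECT id, name, size_bytes FROM blob_meta ORDER BY id LIMIT 100"
).fetchall()

# Step 2: only the one blob the caller settles on is downloaded.
chosen_id = meta_rows[41][0]
(payload,) = conn.execute(
    "SELECT payload FROM blob_data WHERE meta_id = ?", (chosen_id,)
).fetchone()

print(len(meta_rows), "metadata rows,", len(payload), "bytes of blob data")
```

The same two-query shape should carry over to Postgres with a driver of your choice; only the connection and the column types would change.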
Quote:
Originally Posted by pootle flump
Also, are these blobs currently on a filesystem and, if so, what is the driver to move them into the database? (The reason I ask is that moving the blobs into the database is a design decision that is not explained by your post.)
|
We have hundreds of thousands of little files that are almost impossible to back up (due to the overhead of moving each file individually). We also want to expose this data as a web service. Granted, we can do that with the file system, but it would be easier to implement if we could write queries against a dataset rather than having to rely on the file system's structure.